Presentation type:
ESSI – Earth & Space Science Informatics

ESSI1 – Next-Generation Analytics for Scientific Discovery: Data Science, Machine Learning, AI

EGU23-2843 | ECS | PICO | ESSI1.1

Geography-Aware Masked Autoencoders for Change Detection in Remote Sensing 

Lukas Kondmann, Caglar Senaras, Yuki M. Asano, Akhil Singh Rana, Annett Wania, and Xiao Xiang Zhu

The increasing coverage of commercial and public satellites allows us to monitor the pulse of the Earth at ever-higher frequency (Zhu et al., 2017). Together with the rise of deep learning in artificial intelligence (AI) (LeCun et al., 2015), the field of AI for Earth Observation (AI4EO) is growing rapidly. However, many supervised deep learning techniques are data-hungry: annotated data in large quantities are necessary for these algorithms to reach their full potential. In many Earth Observation applications such as change detection, this is often infeasible because high-quality annotations require manual labeling, which is time-consuming and costly.

Self-supervised learning (SSL) can help tackle the issue of limited label availability in AI4EO. In SSL, an algorithm is pretrained with tasks that only require the input data, without annotations. Notably, Masked Autoencoders (MAE) have recently shown promising performance: a Vision Transformer learns to reconstruct a full image from only 25% of it as input. We hypothesize that the success of MAEs also extends to satellite imagery and evaluate this with a change detection downstream task. In addition, we provide a multitemporal baseline using DINO, another widely successful SSL method. Further, we test a second version of MAEs, which we call GeoMAE. GeoMAE incorporates the location and date of the satellite image as auxiliary information in self-supervised pretraining. The coordinates and date information are passed as additional tokens to the MAE model, similarly to the positional encoding.
The pretraining dataset used is the RapidAI4EO corpus, which contains multi-temporal Planet Fusion imagery for a variety of locations across Europe. The dataset for the downstream task also uses Planet Fusion imagery in pairs as input data. These are provided at a 600 m × 600 m patch level three months apart, together with a label indicating whether the respective patch has changed in this period. Self-supervised pretraining is done for up to 150 epochs, and we take the model with the best validation performance on the downstream task as the starting point for the test set.
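As an illustration of the token mechanism described above, the sketch below shows one way coordinates and acquisition date could be embedded and prepended to the patch tokens of a ViT-style encoder. This is a minimal PyTorch sketch under our own naming (GeoTokens and all shapes are our assumptions), not the authors' GeoMAE implementation:

```python
import torch
import torch.nn as nn

class GeoTokens(nn.Module):
    """Embed (lat, lon) and day-of-year as two extra tokens prepended to
    the patch-token sequence of a ViT/MAE encoder."""
    def __init__(self, dim):
        super().__init__()
        self.loc = nn.Linear(2, dim)    # latitude/longitude -> one token
        self.date = nn.Linear(2, dim)   # (sin, cos) of day-of-year -> one token

    def forward(self, patch_tokens, latlon, doy):
        angle = 2 * torch.pi * doy / 365.25
        date_feat = torch.stack([angle.sin(), angle.cos()], dim=-1)
        extra = torch.stack([self.loc(latlon), self.date(date_feat)], dim=1)
        return torch.cat([extra, patch_tokens], dim=1)   # (B, 2 + N, dim)

patch_tokens = torch.randn(8, 196, 768)        # tokens of the visible patches
latlon = torch.rand(8, 2)                      # normalised coordinates
doy = torch.randint(1, 366, (8,)).float()      # acquisition day of year
augmented = GeoTokens(768)(patch_tokens, latlon, doy)  # (8, 198, 768)
```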

We find that the regular MAE model scores best on the test set with an accuracy of 81.54%, followed by DINO with 80.63% and GeoMAE with 80.02%. Pretraining MAE with ImageNet data instead of satellite images results in a notable performance loss, down to 71.36%. Overall, our current pretraining experiments cannot yet confirm our hypothesis that GeoMAE is advantageous compared to regular MAE. However, in a similar spirit, Cong et al. (2022) recently introduced SatMAE, which shows that for other remote sensing applications the combination of auxiliary information and novel masking strategies is a key factor. Therefore, a combination of location and time inputs together with adapted masking may also hold the most potential for change detection. There is ample potential for future research in geo-specific applications of MAEs, and we provide a starting point for this with our experimental results for change detection.

How to cite: Kondmann, L., Senaras, C., Asano, Y. M., Rana, A. S., Wania, A., and Zhu, X. X.: Geography-Aware Masked Autoencoders for Change Detection in Remote Sensing, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2843, https://doi.org/10.5194/egusphere-egu23-2843, 2023.

EGU23-3267 | ECS | PICO | ESSI1.1

Decomposition learning based on spatial heterogeneity: A case study of COVID-19 infection forecasting in Germany 

Ximeng Cheng, Jost Arndt, Emilia Marquez, and Jackie Ma

New models are emerging from Artificial Intelligence (AI) and its sub-fields, in particular Machine Learning and Deep Learning, and are being applied in different areas including geography (e.g., land cover identification and traffic volume forecasting based on spatial data). Unlike the well-known datasets often used to develop AI models (e.g., ImageNet for image classification), spatial data has an intrinsic feature, spatial heterogeneity, which leads to relationships between the independent variables (the model input X) and the dependent variable (the model output Y) that vary across regions. This makes it difficult to conduct large-scale studies with a single robust AI model. In this study, we draw on the idea of modular learning, i.e., we decompose large-scale tasks into sub-tasks for specific sub-regions and use multiple AI models to solve these sub-tasks. The decomposition is based on spatial characteristics, ensuring that the relationship between independent and dependent variables is similar within each sub-region. We explore this approach for forecasting COVID-19 cases in Germany using spatiotemporal data (e.g., weather data and human mobility data) as an example, and compare forecasting with a single model to the proposed decomposition learning procedure in terms of accuracy and efficiency. This study is part of the project DAKI-FWS, which is funded by the Federal Ministry of Economic Affairs and Climate Action in Germany to develop an early warning system to stabilize the German economy.
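The decomposition idea can be illustrated with a minimal sketch: districts are first grouped by their spatial characteristics, and a separate regressor is then fitted per sub-region. All data and model choices here are hypothetical placeholders, not the study's actual pipeline:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 12))                 # per-district weather/mobility features
y = rng.poisson(50, size=400).astype(float)    # per-district case counts
spatial_chars = rng.normal(size=(400, 4))      # spatial characteristics per district

# decompose: group districts whose X-y relationship is expected to be similar
sub_region = KMeans(n_clusters=5, random_state=0).fit_predict(spatial_chars)

# one model per sub-region instead of a single country-wide model
models = {r: GradientBoostingRegressor().fit(X[sub_region == r], y[sub_region == r])
          for r in np.unique(sub_region)}
forecast = models[sub_region[0]].predict(X[:1])  # predict with the local model
```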

How to cite: Cheng, X., Arndt, J., Marquez, E., and Ma, J.: Decomposition learning based on spatial heterogeneity: A case study of COVID-19 infection forecasting in Germany, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3267, https://doi.org/10.5194/egusphere-egu23-3267, 2023.

EGU23-4929 | PICO | ESSI1.1

Using AI and ML to support marine science research 

Ilaria Fava, Peter Thijsse, Gergely Sipos, and Dick Schaap

The iMagine project is devoted to developing and delivering imaging data and services for aquatic science. Started in September 2022, the project will provide a portfolio of image data collections, high-performance image analysis tools empowered with Artificial Intelligence, and best practice documents for scientific image analysis. These services and documents will enable better and more efficient processing and analysis of imaging data in marine and freshwater research, accelerating our scientific insights into processes and measures relevant to healthy oceans, seas, and coastal and inland waters. By building on the European Open Science Cloud compute platform, iMagine delivers a generic framework for AI model development, training, and deployment, which researchers can adopt for refining their AI-based applications for water pollution mitigation, biodiversity and ecosystem studies, climate change analysis and beach monitoring, but also for developing and optimising other AI-based applications in this field. The iMagine AI development and testing framework offers neural networks, parallel post-processing of extensive data, and analysis of massive online data streams in distributed environments. The synergies among the eight aquatic use cases in the project will lead to common solutions in data management, quality control, performance, integration, provenance, and FAIRness, and will contribute to harmonisation across research infrastructures (RIs). The resulting iMagine AI development and testing platform and the iMagine use case applications will add another component to the European marine data management landscape, of value for the Digital Twin of the Ocean, EMODnet, Copernicus, and international initiatives.

How to cite: Fava, I., Thijsse, P., Sipos, G., and Schaap, D.: Using AI and ML to support marine science research, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4929, https://doi.org/10.5194/egusphere-egu23-4929, 2023.

EGU23-6818 | ECS | PICO | ESSI1.1

Eddy identification from along-track altimeter data with multi-modal deep learning 

Adili Abulaitijiang, Eike Bolmer, Ribana Roscher, Jürgen Kusche, and Luciana Fenoglio-Marc

Eddies are circular rotating water masses, usually generated near large ocean currents such as the Gulf Stream. Monitoring eddies and gaining knowledge of eddy statistics over a large region are important for fisheries, marine biology studies, and the testing of ocean models.

At the mesoscale, eddies are observed in radar altimetry, and methods have been developed to identify, track, and classify them in gridded maps of sea surface height derived from multi-mission data sets. However, this procedure has drawbacks, since much information is lost in the gridded maps: inevitably, the spatial and temporal resolution of the original altimetry data degrades during the gridding process. Moreover, identifying eddies has so far been a post-analysis step on the gridded dataset, which is not suitable for near-real-time applications or forecasts. In the EDDY project at the University of Bonn, we aim to develop methods for identifying eddies directly from along-track altimetry data via a machine (deep) learning approach.

Since eddy signatures (the eddy boundary and highs and lows in the sea level anomaly, SLA) cannot be extracted directly from along-track altimetry data, gridded altimetry maps from AVISO are used to detect eddies. These serve as the reference data for machine learning. The eddy detection on 2D grid maps is produced by an open-source geometry-based approach (e.g., py-eddy-tracker; Mason et al., 2014) with additional constraints like the Okubo-Weiss parameter. Sea Surface Temperature (SST) maps of the same region and date (also available from AVISO) are then used for manually cleaning the reference data. Since altimetry grid maps and SST maps have different temporal and spatial resolutions, we also use a high-resolution (~6 km) ocean model simulation dataset (e.g., FESOM, the Finite Element Sea ice Ocean Model). The FESOM dataset provides coherent, high-resolution SLA, SST, and salinity maps for the study area and is a potential test basis for developing the deep learning network.
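For reference, the Okubo-Weiss parameter used as an additional constraint combines strain and vorticity of the velocity field; below is a generic NumPy sketch (placeholder velocities, not the EDDY project's code):

```python
import numpy as np

def okubo_weiss(u, v, dx, dy):
    """Okubo-Weiss parameter W = s_n^2 + s_s^2 - omega^2 for a 2D
    velocity field u, v indexed as [y, x], with grid spacing dx, dy (m)."""
    dudy, dudx = np.gradient(u, dy, dx)
    dvdy, dvdx = np.gradient(v, dy, dx)
    s_n = dudx - dvdy        # normal strain
    s_s = dvdx + dudy        # shear strain
    omega = dvdx - dudy      # relative vorticity
    return s_n**2 + s_s**2 - omega**2

u, v = np.random.randn(2, 100, 100)   # placeholder geostrophic velocities
W = okubo_weiss(u, v, dx=25e3, dy=25e3)
cores = W < -0.2 * W.std()            # common eddy-core criterion (threshold varies)
```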

The single-modal training via a Convolutional Neural Network (CNN) on the 2D altimetry grid maps produced an excellent Dice score of 86%, meaning the network detects almost all eddies in the Gulf Stream consistently with the reference data. For the multi-modal training, two different networks are developed, for the 1D along-track altimetry data and for the 2D grid maps from SLA and SST, respectively, and then combined to give the final classification output. A transformer model is deemed efficient for encoding the spatiotemporal information in the 1D along-track altimetry data, while a CNN is sufficient for the 2D grid maps from multiple sensors.
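The Dice score reported above measures the overlap between predicted and reference eddy masks. A minimal sketch of the metric itself (placeholder masks, not the project's evaluation code):

```python
import numpy as np

def dice_score(pred, ref):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    return 2.0 * intersection / (pred.sum() + ref.sum())

pred = np.random.rand(256, 256) > 0.5   # predicted eddy mask (placeholder)
ref = np.random.rand(256, 256) > 0.5    # reference mask, e.g. from py-eddy-tracker
print(dice_score(pred, ref))
```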

In this presentation, we show the eddy classification results from the multi-modal deep learning approach based on along-track and gridded multi-source datasets for the Gulf Stream area for the period between 2017 and 2019. Results show that multi-modal deep learning improves the classification by more than 20% compared to training a transformer model on along-track data alone.

How to cite: Abulaitijiang, A., Bolmer, E., Roscher, R., Kusche, J., and Fenoglio-Marc, L.: Eddy identification from along-track altimeter data with multi-modal deep learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6818, https://doi.org/10.5194/egusphere-egu23-6818, 2023.

EGU23-8479 | ECS | PICO | ESSI1.1

Model evaluation strategy impacts the interpretation and performance of machine learning models 

Lily-belle Sweet, Christoph Müller, Mohit Anand, and Jakob Zscheischler

Machine learning models are able to capture highly complex, nonlinear relationships, and have been used in recent years to accurately predict crop yields at regional and national scales. This success suggests that the use of ‘interpretable’ or ‘explainable’ machine learning (XAI) methods may facilitate improved scientific understanding of the compounding interactions between climate, crop physiology and yields. However, studies have identified implausible, contradictory or ambiguous results from the use of these methods. At the same time, researchers in fields such as ecology and remote sensing have called attention to issues with robust model evaluation on spatiotemporal datasets. This suggests that XAI methods may produce misleading results when applied to spatiotemporal datasets, but the impact of the model evaluation strategy on the results of such methods has not yet been examined.

In this study, machine learning models are trained to predict simulated crop yield, and the impact of model evaluation strategy on the interpretation and performance of the resulting models is assessed. Using data from a process-based crop model allows us to then comment on the plausibility of the explanations provided by common XAI methods. Our results show that the choice of evaluation strategy has an impact on (i) the interpretations of the model using common XAI methods such as permutation feature importance and (ii) the resulting model skill on unseen years and regions. We find that use of a novel cross-validation strategy based on clustering in feature-space results in the most plausible interpretations. Additionally, we find that the use of this strategy during hyperparameter tuning and feature selection results in improved model performance on unseen years and regions. Our results provide a first step towards the establishment of best practices for model evaluation strategy in similar future studies.
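A minimal scikit-learn sketch of cross-validation grouped by clusters in feature space, the kind of strategy described above (placeholder data; the study's exact configuration may differ):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))   # climate features (placeholder)
y = rng.normal(size=1000)        # simulated crop yield (placeholder)

# clusters in feature space define the CV groups, so each fold holds out
# a distinct region of feature space instead of randomly drawn samples
groups = KMeans(n_clusters=5, random_state=0).fit_predict(X)
scores = cross_val_score(RandomForestRegressor(n_estimators=100), X, y,
                         groups=groups, cv=GroupKFold(n_splits=5))
print(scores.mean())
```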

How to cite: Sweet, L., Müller, C., Anand, M., and Zscheischler, J.: Model evaluation strategy impacts the interpretation and performance of machine learning models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8479, https://doi.org/10.5194/egusphere-egu23-8479, 2023.

EGU23-9437 | PICO | ESSI1.1

On Unsupervised Learning from Environmental Data 

Mikhail Kanevski

Predictive learning from data is usually formulated as the problem of finding the best connection between input and output spaces by optimizing well-defined cost or risk functions.

In geo-environmental studies, the input space is usually constructed from geographical coordinates and features generated from different sources of available information (feature engineering), by applying expert knowledge, using deep learning technologies, and taking into account the objectives of the study. Often it is not known in advance whether the input space is complete or contains redundant features. Therefore, unsupervised learning (UL) is essential in environmental data analysis, modelling, prediction and visualization. UL also helps to better understand the data and the phenomena they describe, as well as to interpret and communicate modelling strategies and results in the decision-making process.

The main objective of the present investigation is to review some important topics in unsupervised learning from environmental data: 1) quantitative description of the input space (“monitoring network”) structure using global and local topological and fractal measures, 2) dimensionality reduction, and 3) unsupervised feature selection and clustering, applying a variety of machine learning algorithms (kernel-based methods, ensemble learning, self-organizing maps) and visualization tools. An example of point 1) is sketched below.
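As an illustration of the fractal measures in point 1), the correlation (fractal) dimension of a monitoring network can be estimated with the Grassberger-Procaccia correlation sum. This is a generic sketch with a hypothetical network, not the study's code:

```python
import numpy as np
from scipy.spatial.distance import pdist

def correlation_dimension(points, radii):
    """Grassberger-Procaccia correlation sum C(r); the slope of
    log C(r) versus log r estimates the fractal (correlation) dimension."""
    d = pdist(points)
    c = np.array([(d < r).mean() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return slope

stations = np.random.default_rng(0).random((500, 2))   # hypothetical network
print(correlation_dimension(stations, np.logspace(-2, -0.5, 10)))  # ~2 if space-filling
```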

Major attention is paid to simulated and real spatial data (pollution, permafrost, geomorphological and wind field data). The case studies considered differ in input space dimensionality/topology and in the number of measurements. It is confirmed that UL should be considered an integral part of a generic methodology for environmental data analysis. Comprehensive comparisons and discussions of the results conclude the research.

How to cite: Kanevski, M.: On Unsupervised Learning from Environmental Data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9437, https://doi.org/10.5194/egusphere-egu23-9437, 2023.

EGU23-11601 | PICO | ESSI1.1

Clustering Geodata Cubes (CGC) and Its Application to Phenological Datasets 

Francesco Nattino, Ou Ku, Meiert W. Grootes, Emma Izquierdo-Verdiguier, Serkan Girgin, and Raúl Zurita-Milla

Unsupervised classification techniques are becoming essential for extracting information from the wealth of data that Earth observation satellites and other sensors currently provide. These datasets are inherently complex to analyze due to their extent across multiple dimensions (spatial, temporal, and often a spectral or band dimension), their size, and the high resolution of current sensors. Traditional one-dimensional cluster analysis approaches, which are designed to find groups of similar elements in datasets such as rasters or time series, may fall short of identifying patterns in these higher-dimensional datasets, often referred to as data cubes. In this context, we present our Clustering Geodata Cubes (CGC) software, an open-source Python package that implements a set of co- and tri-clustering algorithms to simultaneously group elements across two and three dimensions, respectively. The package includes different implementations to efficiently tackle both datasets that fit into the memory of a single machine and very large datasets that require cluster computing. A refining strategy to facilitate the identification of data patterns is also provided. We apply CGC to investigate gridded datasets representing the predicted day of the year when spring onset events (first leaf, first bloom) occur according to a well-established phenological model. Specifically, we consider spring indices computed at high spatial resolution (1 km) and continental scale (conterminous United States) for the last 40+ years and extract the main spatiotemporal patterns present in the data via CGC's co-clustering functionality.
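CGC ships its own co- and tri-clustering implementations; purely to illustrate the co-clustering idea on a flattened space-by-time matrix, here is a sketch using scikit-learn's spectral co-clustering (explicitly not the CGC API, and with placeholder data):

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

# data cube flattened to a (space x time) matrix: rows = grid cells,
# columns = years; entries = day of year of a spring onset event
rng = np.random.default_rng(0)
Z = rng.normal(loc=120, scale=10, size=(1000, 40))

model = SpectralCoclustering(n_clusters=5, random_state=0).fit(Z)
row_labels = model.row_labels_       # spatial cluster of each grid cell
col_labels = model.column_labels_    # temporal cluster of each year
```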

How to cite: Nattino, F., Ku, O., Grootes, M. W., Izquierdo-Verdiguier, E., Girgin, S., and Zurita-Milla, R.: Clustering Geodata Cubes (CGC) and Its Application to Phenological Datasets, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11601, https://doi.org/10.5194/egusphere-egu23-11601, 2023.

EGU23-12773 | PICO | ESSI1.1

Industrial Atmospheric Pollution Estimation Using Gaussian Process Regression 

Anton Sokolov, Hervé Delbarre, Daniil Boldyriev, Tetiana Bulana, Bohdan Molodets, and Dmytro Grabovets

Industrial pollution remains a major challenge in spite of recent technological developments and purification procedures. To effectively monitor atmospheric contamination, data from air quality networks should be coupled with advanced spatiotemporal statistical methods.

Our previous studies showed that standard interpolation techniques (like inverse distance weighting, linear or spline interpolation, and kernel-based Gaussian Process Regression, GPR) are quite limited for simulating smoke-like, narrowly directed industrial pollution in the vicinity of the source (a few tens of kilometres). In this work, we apply GPR based on statistically estimated covariances. These covariances are calculated using the CALPUFF atmospheric pollution dispersion model for a one-year simulation in the Kryvyi Rih region. The application of GPR permits taking into account the high correlations between pollution values at neighboring points revealed by the modelling. The result of the covariance-based GPR technique is compared with other interpolation techniques. It can then be used in the assessment and optimization of air quality networks.
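For reference, plain kernel-based GPR interpolation of station data looks as follows; the study's key change is to replace the generic kernel below with covariances estimated from the CALPUFF simulations (station coordinates and values are placeholders):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# station coordinates (km) and measured concentrations (placeholder values)
X = np.array([[0.0, 0.0], [5.0, 2.0], [3.0, 8.0], [9.0, 4.0]])
y = np.array([42.0, 30.0, 12.0, 8.0])

kernel = 1.0 * RBF(length_scale=3.0) + WhiteKernel(noise_level=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

grid = np.mgrid[0:10:50j, 0:10:50j].reshape(2, -1).T     # evaluation grid
mean, std = gpr.predict(grid, return_std=True)           # field + uncertainty
```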

How to cite: Sokolov, A., Delbarre, H., Boldyriev, D., Bulana, T., Molodets, B., and Grabovets, D.: Industrial Atmospheric Pollution Estimation Using Gaussian Process Regression, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12773, https://doi.org/10.5194/egusphere-egu23-12773, 2023.

EGU23-12933 | ECS | PICO | ESSI1.1

Estimating vegetation carbon stock components by linking ground databases with Earth observations 

Daniel Kinalczyk, Christine Wessollek, and Matthias Forkel

Land ecosystems dampen the increase of atmospheric CO2 by storing carbon in soils and vegetation. In order to estimate how long carbon stays in land ecosystems, detailed knowledge about the distribution of carbon in different vegetation components is needed. Current Earth observation products provide estimates of total above-ground biomass but do not further separate the carbon stored in trees, understory vegetation, shrubs, grass, litter or woody debris. Here we present an approach in which we link several Earth observation products with a ground-based database to estimate biomass in various vegetation components. To do so, we use information about the statistical distribution of biomass components provided by the North American Wildland Fuels Database (NAWFD), which is, however, not available as geocoded data. We use ESA CCI AGB version 3 data from 2010 as a proxy to link the NAWFD data to the spatial information from Earth observation products. The biomass and corresponding uncertainty from ESA CCI AGB and a map of vegetation types are used to select the likely distribution of vegetation biomass components from the set of in-situ measurements of tree biomass. We then apply Isolation Forest outlier detection and bootstrapping for a robust comparison of both datasets and for uncertainty estimation. We use Random Forest and Gaussian Process regression to predict the biomass of trees, shrubs, snags, herbaceous vegetation, coarse and fine woody debris, duff and litter from ESA CCI AGB and land cover, GEDI canopy height, Sentinel-3 LAI and bioclimatic data. The regression models reach high predictive power and also allow extrapolation to other regions. Our derived estimates of vegetation carbon stock components provide a more detailed view of land carbon storage and contribute to an improved estimate of potential carbon emissions from respiration, disturbances and fires.
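Schematically, the outlier screening and multi-output regression steps could look as follows (placeholder predictors and targets, not the actual NAWFD/ESA CCI pipeline):

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 6))   # e.g. AGB, land cover, canopy height, LAI, bioclim
Y = rng.normal(size=(5000, 4))   # e.g. biomass of trees, shrubs, woody debris, litter

keep = IsolationForest(random_state=0).fit_predict(X) == 1   # drop outliers (-1)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[keep], Y[keep])
pred = rf.predict(X[:5])         # per-component biomass estimates
```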

How to cite: Kinalczyk, D., Wessollek, C., and Forkel, M.: Estimating vegetation carbon stock components by linking ground databases with Earth observations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12933, https://doi.org/10.5194/egusphere-egu23-12933, 2023.

EGU23-13196 | ECS | PICO | ESSI1.1

From Super-Resolution to Downscaling - An Image-Inpainting Deep Neural Network for High Resolution Weather and Climate Models 

Maximilian Witte, Danai Filippou, Étienne Plésiat, Johannes Meuer, Hannes Thiemann, David Hall, Thomas Ludwig, and Christopher Kadow

High resolution has always been a common and ongoing goal of the weather and climate community. In this regard, machine learning techniques have accompanied numerical and statistical methods in recent years. Here we demonstrate that artificial intelligence can skilfully downscale low-resolution climate model data when combined with numerical climate model data. We show that a recently developed image-inpainting technique performs accurate super-resolution via transfer learning using the HighResMIP experiments of CMIP6 (Coupled Model Intercomparison Project Phase 6). Its huge database offers a unique training opportunity for machine learning approaches. Transfer learning also allows downscaling other CMIP6 experiments and models, as well as observational data like HadCRUT5. Combined with the technology of Kadow et al. (2020) for infilling missing climate data, we obtain a neural network which reconstructs and downscales the important observational data set (IPCC AR6) at the same time. We further investigate the application of our method to downscaling quantities predicted by a numerical ocean model (ICON-O) to improve computation times. In this process we focus on the ability of the model to predict eddies from low-resolution data.
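The authors build on the inpainting network of Kadow et al. (2020); purely as an illustration of CNN-based super-resolution of a gridded field, here is a minimal PixelShuffle upscaling sketch (hypothetical shapes and architecture, not the study's network):

```python
import torch
import torch.nn as nn

class SuperResCNN(nn.Module):
    """Minimal CNN that maps a coarse field to a 4x finer grid."""
    def __init__(self, scale=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),  # rearranges channels into spatial detail
        )

    def forward(self, x):
        return self.net(x)

coarse = torch.randn(8, 1, 45, 90)    # e.g. low-resolution temperature maps
fine = SuperResCNN()(coarse)          # -> (8, 1, 180, 360)
```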

An extension to:

Kadow, C., Hall, D.M. & Ulbrich, U. Artificial intelligence reconstructs missing climate information. Nature Geoscience 13, 408–413 (2020). https://doi.org/10.1038/s41561-020-0582-5

How to cite: Witte, M., Filippou, D., Plésiat, É., Meuer, J., Thiemann, H., Hall, D., Ludwig, T., and Kadow, C.: From Super-Resolution to Downscaling - An Image-Inpainting Deep Neural Network for High Resolution Weather and Climate Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13196, https://doi.org/10.5194/egusphere-egu23-13196, 2023.

EGU23-14716 | ECS | PICO | ESSI1.1

Spatial-temporal transferability assessment of remote sensing data models for mapping agricultural land use 

Jayan Wijesingha, Ilze Dzene, and Michael Wachendorf

To assess the impact of anthropogenic and natural causes on land use and land cover change, mapping of spatial and temporal changes is increasingly applied. Due to the availability of satellite image archives, machine learning models based on remote sensing (RS) data are particularly suitable for mapping and analysing land use and land cover changes. Most often, models trained on current RS data are employed to estimate past land cover and land use from archived RS data, under the assumption that the trained model predicts past data with an accuracy similar to that for present data. However, machine learning models trained on RS data from particular locations and times may not transfer well to new locations and time periods, for various reasons. This study aims to assess the spatial-temporal transferability of RS data models in the context of agricultural land use mapping. The study was designed to map agricultural land use (5 classes: maize, grasslands, summer crops, winter crops, and mixed crops) in two regions in Germany (North Hesse and Weser Ems) between the years 2010 and 2018 using Landsat archive data (i.e., Landsat 5, 7, and 8). Three model transferability scenarios were evaluated: a) temporal (S1), b) spatial (S2), and c) spatial-temporal (S3). Two machine learning models (random forest, RF, and a Convolutional Neural Network, CNN) were trained. For each transferability scenario, class-level F1 and macro F1 values were compared between the reference and the targeted transferability settings. Moreover, to explain the results of the transferability scenarios, they were further explored using the dissimilarity index and area of applicability (AOA) concepts. The average macro F1 value of the trained model for the reference scenario (no transferability) was 0.75. For the assessed transferability scenarios, the average macro F1 values were 0.70, 0.65 and 0.60 for S1, S2, and S3, respectively. This shows that model performance decreases when predicting data from different spatial-temporal contexts. In contrast, the average proportion of the data inside the AOA did not show a clear pattern across scenarios. In the context of model building with RS data, spatial-temporal transferability is essential because of the limited availability of labelled data. Thus, the results from this case study provide an understanding of how model performance changes when a model is transferred to new settings with data from different temporal and spatial domains.
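A simplified sketch of the dissimilarity index underlying the AOA concept (in the spirit of Meyer & Pebesma, 2021; this is our simplification, not the exact formulation used in the study):

```python
import numpy as np
from scipy.spatial.distance import cdist

def dissimilarity_index(X_train, X_new):
    """Distance of each new sample to its nearest training sample,
    normalised by the mean pairwise distance within the training set."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    a, b = (X_train - mu) / sd, (X_new - mu) / sd
    d_new = cdist(b, a).min(axis=1)
    d_bar = cdist(a, a)[np.triu_indices(len(a), k=1)].mean()
    return d_new / d_bar

rng = np.random.default_rng(0)
di = dissimilarity_index(rng.normal(size=(500, 8)),       # training features
                         rng.normal(2.0, size=(100, 8)))  # shifted target domain
# samples whose DI exceeds a threshold derived from the training CV folds
# are considered outside the area of applicability (AOA)
```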

How to cite: Wijesingha, J., Dzene, I., and Wachendorf, M.: Spatial-temporal transferability assessment of remote sensing data models for mapping agricultural land use, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14716, https://doi.org/10.5194/egusphere-egu23-14716, 2023.

EGU23-16096 | ECS | PICO | ESSI1.1

Limitations of machine learning in a spatial context 

Jens Heinke, Christoph Müller, and Dieter Gerten

Machine learning algorithms have become popular tools for the analysis of spatial data. However, a number of studies have demonstrated that the application of machine learning algorithms in a spatial context has limitations. New geographic locations may lie outside of the data range for which the model was trained, and estimates of model performance may be too optimistic when spatial autocorrelation of geographic data is not properly accounted for in cross-validation. We here use artificially created spatial data fields to conduct a series of experiments that further investigate the potential pitfalls of random forest regression applied to spatial data. We provide new insights into previously reported limitations and identify further ones. We demonstrate that the same mechanism that leads to overoptimistic estimates of model performance (when based on ordinary random k-fold cross-validation) can also lead to a deterioration of model performance. When covariates contain sufficient information to deduce spatial coordinates, the model can reproduce any spatial pattern in the training data, even if it is entirely or partly unrelated to the covariates. The presence of spatially correlated residuals in the training data changes how the model utilizes the information in the covariates and impedes the identification of the actual relationship between covariates and response. This reduces model performance when the model is applied to data with a different spatial structure. Under such conditions, machine learning methods that are flexible enough to fit autocorrelated residuals (such as random forest) may not be an optimal choice. Better models may be obtained using less flexible but more transparent approaches such as generalized linear models or additive models.
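A toy experiment illustrating the mechanism (our construction, not the study's setup): when covariates encode the coordinates, a random forest can "predict" a purely spatial signal, and random k-fold cross-validation rewards it while spatially blocked cross-validation reveals the failure:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

rng = np.random.default_rng(0)
n = 2000
xy = rng.uniform(0, 1, size=(n, 2))                  # sample locations
y = np.sin(3 * xy[:, 0]) + np.cos(5 * xy[:, 1])      # purely spatial signal
X = np.c_[xy, rng.normal(size=(n, 3))]               # covariates leak the coordinates

rf = RandomForestRegressor(n_estimators=100, random_state=0)
random_cv = cross_val_score(rf, X, y, cv=KFold(5, shuffle=True, random_state=0))
blocks = (xy[:, 0] * 5).astype(int)                  # spatial blocks along x
spatial_cv = cross_val_score(rf, X, y, groups=blocks, cv=GroupKFold(5))
print(random_cv.mean(), spatial_cv.mean())           # optimistic vs. degraded skill
```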

How to cite: Heinke, J., Müller, C., and Gerten, D.: Limitations of machine learning in a spatial context, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16096, https://doi.org/10.5194/egusphere-egu23-16096, 2023.

EGU23-16768 | PICO | ESSI1.1

Knowledge Representation of Levee Systems - an Environmental Justice Perspective 

Armita Davarpanah, Anthony L. Nguy Robertson, Monica Lipscomb, Jacob W. McCord, and Amy Morris

Levee systems are designed to reduce the risk of water-related natural hazards (e.g., flooding) in the areas behind levees. Most levees in the U.S. are designed to protect people and facilities against the impacts of 100-year floods. However, climate change is increasing the probability of 500-year flood events, which in turn increases the likelihood of economic loss, environmental damage, and fatalities that disproportionately impact communities of color and low-income groups facing socio-economic inequities in leveed areas. The increased frequency and intensity of flooding is putting extra pressure on emergency responders, who often require diverse, multi-dimensional data originating from different sources to make sound decisions. Currently, the integration of these heterogeneous data, acquired by diverse sensors and emergency agencies, about environmental, hydrological, and demographic indicators requires costly and complex programming and analysis that hinders rapid disaster management efforts. Our domain ontology, the Levee System Ontology (LSO), resolves these data integration and software interoperability issues by semantically modeling the static aspects, dynamic processes, and information content of levee systems, extending the well-structured, top-level Basic Formal Ontology (BFO) and mid-level Common Core Ontologies (CCO). LSO's class and property names follow the terminology of the National Levee Database (NLD), allowing data scientists using NLD data to constrain their classifications based on the knowledge represented in LSO. In addition to modeling the information related to the characteristics and status of the structural components of the levee system, LSO represents the residual risk in leveed areas, economic and environmental losses, and damage to facilities in case of breaching and/or overtopping of levees. LSO enables reasoning to infer components and places along levees and floodwalls where the system requires inspection, maintenance, and repair based on the status of system components. The ontology also represents the impact of flood management activities on different groups of people from an environmental justice perspective, based on the principles of DEI (diversity, equity, inclusion) as defined by the U.N. Sustainable Development Goals.

How to cite: Davarpanah, A., Nguy Robertson, A. L., Lipscomb, M., McCord, J. W., and Morris, A.: Knowledge Representation of Levee Systems - an Environmental Justice Perspective, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16768, https://doi.org/10.5194/egusphere-egu23-16768, 2023.

EGU23-102 | ITS1.14/CL5.8

Inferring Causal Structures to Model and Predict ENSO and Its Effect on Asian Summer Monsoon

S. He, S. Yang, and D. Chen

Large-scale climate variability is analysed, modelled, and predicted mainly based on general circulation models and low-dimensional association analysis. The models' equational basis makes it difficult to produce mathematical analysis results and clear interpretations, whereas association analysis cannot establish causation sufficiently to make invariant predictions. However, the macroscale causal structures of the climate system may accomplish the tasks of analysis, modelling, and prediction, according to the concepts of causal emergence and the invariance of causal predictions.

Under the assumptions of no unobserved confounders and linear Gaussian models, we examine whether the macroscale causal structures of the climate system can be inferred not only to model but also to predict large-scale climate variability. Specifically, first, we obtain the causal structures of the macroscale air-sea interactions of the El Niño–Southern Oscillation (ENSO), which are interpretable in terms of physics. The structural causal models constructed accordingly can model the ENSO diversity realistically and predict the ENSO variability. Second, this study identifies the joint effect of ENSO and three other winter climate phenomena on the interannual variability of the East Asian summer monsoon. Using regression, these causal precursors can predict the monsoon one season ahead, outperforming association-based empirical models and several climate models. Third, we introduce a framework that infers ENSO's air-sea interactions from high-dimensional data sets. The framework is based on aggregating the causal discovery results of bootstrap samples to improve high-dimensional variable selection, and on spatial dimension reduction to allow clear interpretations at the macroscale.

While further integration with nonlinear non-Gaussian models will be necessary to establish the full benefits of inferring causal structures as a standard practice in research and operational predictions, our study may offer a route to providing concise explanations of the climate system and reaching accurate invariant predictions.
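As a simplified stand-in for the bootstrap-aggregation step (using regularised-regression variable selection rather than the authors' causal discovery algorithm), the idea of stabilising high-dimensional precursor selection looks like this:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))                       # candidate precursor indices
y = X[:, 0] - 0.8 * X[:, 3] + rng.normal(scale=0.5, size=n)  # true drivers: 0 and 3

# aggregate variable selection over bootstrap samples for robustness
counts = np.zeros(p)
for _ in range(50):
    idx = rng.integers(0, n, n)                   # bootstrap resample
    counts += LassoCV(cv=5).fit(X[idx], y[idx]).coef_ != 0
stable = np.flatnonzero(counts / 50 > 0.8)        # consistently selected precursors
print(stable)                                     # expected: [0 3]
```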

How to cite: He, S., Yang, S., and Chen, D.: Inferring Causal Structures to Model and Predict ENSO and Its Effect on Asian Summer Monsoon, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-102, https://doi.org/10.5194/egusphere-egu23-102, 2023.

EGU23-239 | ECS | Orals | ITS1.14/CL5.8

Toward a hybrid tropical cyclone global model 

Roberto Ingrosso and Mathieu Boudreault

The future evolution of tropical cyclones (TCs) in a warming world is an important issue, considering their potential socio-economic impacts on the areas hit by these phenomena. Previous studies provide robust answers about the future increase in intensity and in the global proportion of major TCs (Category 4–5). On the other hand, high uncertainty is associated with the projected future decrease in global TC frequency and with potential changes in TC tracks and translation speed.

Risk management and regulatory actions require a more robust quantification of how climate change affects TC dynamics. A probabilistic hybrid TC model based upon statistical and climate models, physically coherent with TC dynamics, is being built to investigate the potential impacts of climate change. Here, we provide preliminary results, in terms of present-climate reconstruction (1980-2021) and future projections (2022-2060) of cyclogenesis locations and TC tracks, based on different statistical models, such as logistic and multiple linear regressions and random forests. Physical predictors associated with TC formation and motion, produced by reanalysis (ERA5) and the Community Earth System Model (CESM) ensemble, are considered in this study.

How to cite: Ingrosso, R. and Boudreault, M.: Toward a hybrid tropical cyclone global model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-239, https://doi.org/10.5194/egusphere-egu23-239, 2023.

EGU23-492 | ECS | Posters on site | ITS1.14/CL5.8

Separation of climate models and observations based on daily output using two machine learning classifiers 

Lukas Brunner, Sebastian Sippel, and Aiko Voigt

Climate models are primary tools to investigate processes in the climate system, to project future changes, and to inform decision makers. The latest generation of models provides increasingly complex and realistic representations of the real climate system while there is also growing awareness that not all models produce equally plausible or independent simulations. Therefore, many recent studies have investigated how models differ from observed climate and how model dependence affects model output similarity, typically drawing on climatological averages over several decades.

Here, we show that temperature maps from individual days from climate models in the CMIP6 archive can be robustly identified as “observation” or “model”, even after removing the global mean. An important exception is a prototype high-resolution simulation from the ICON model family, which cannot be so unambiguously classified into one category. These results highlight that persistent differences between observed and simulated climate emerge already at very short time scales, but very high-resolution modelling efforts may be able to overcome some of these shortcomings.

We use two different machine learning classifiers: (1) logistic regression, which allows easy insights into the learned coefficients but has the limitation of being a linear method, and (2) a convolutional neural network (CNN), which represents the other end of the complexity spectrum, able to learn nonlinear spatial relations between features but lacking the easy interpretability of logistic regression. For CMIP6 both methods perform comparably, and the CNN is also able to recognize about 75% of samples from ICON as coming from a model, while logistic regression does not have any skill in this case.
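A minimal sketch of the linear classifier side of this setup (placeholder data; the real study uses labelled daily maps from CMIP6 models and observations):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
maps = rng.normal(size=(2000, 36 * 72))      # flattened daily temperature maps
labels = rng.integers(0, 2, size=2000)       # 1 = observation, 0 = model

maps -= maps.mean(axis=1, keepdims=True)     # remove each map's global mean
clf = LogisticRegression(max_iter=1000).fit(maps, labels)
coef_map = clf.coef_.reshape(36, 72)         # where obs-model differences matter most
```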

Overall, we demonstrate that the use of machine learning classifiers, once trained, can overcome the need for multiple decades of data to investigate a given model. This opens up novel avenues to test model performance on much shorter times scales.

How to cite: Brunner, L., Sippel, S., and Voigt, A.: Separation of climate models and observations based on daily output using two machine learning classifiers, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-492, https://doi.org/10.5194/egusphere-egu23-492, 2023.

EGU23-753 | ECS | Orals | ITS1.14/CL5.8 | Highlight

Finding regions of similar sea level variability with the help of a Gaussian Mixture Model 

Lea Poropat, Céline Heuzé, and Heather Reese

In climate research we often want to focus on a specific region and the most prominent processes affecting it, but how exactly do we select the borders of that region? We also often need to use long-term in situ observations to represent a larger area, but which area exactly are they representative of? In ocean sciences we usually consider basins as separate regions or, even simpler, just select a rectangle of the ocean, but that does not always correspond to the real, physically relevant borders. As an alternative, we use an unsupervised classification model, the Gaussian Mixture Model (GMM), to separate the northwestern European seas into regions based on the sea level variability observed by altimetry satellites.

After performing a principal component (PC) analysis on 24 years of monthly sea level data, we use the stacked PC maps as input for the GMM. Because the GMM requires the number of classes to be selected a priori, we used the Bayesian Information Criterion to determine into how many regions our area should be split. Depending on the number of PCs used, the optimal number of classes was between 12 and 18, with more PCs typically allowing separation into more regions. Due to the complexity of the data and the dependence of the results on the randomly chosen starting weights, the classification can differ to a degree with every new run of the model, even if we use the exact same data and parameters. To tackle that, instead of using one model, we use an ensemble of models and determine which class each grid point belongs to by soft voting, i.e., each of the models provides a probability that the point belongs to a particular class, and the class with the maximal sum of probabilities wins. As a result, we obtain both the classification and the likelihood of each grid point belonging to its class.
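A sketch of how such a GMM ensemble with BIC-based class selection and soft voting could look in scikit-learn (placeholder data; note that component labels are arbitrary across ensemble members and must be matched before their probabilities can be summed, here via a nearest-means assignment):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 4))      # stacked PC values: one row per grid point

# the GMM needs the number of classes a priori: pick it by minimum BIC
bic = {k: GaussianMixture(k, random_state=0).fit(X).bic(X) for k in range(8, 19)}
k = min(bic, key=bic.get)

# ensemble of GMMs; align each member's components to a reference model
# (by nearest means) before soft voting
members = [GaussianMixture(k, random_state=s).fit(X) for s in range(10)]
total = np.zeros((len(X), k))
for m in members:
    _, perm = linear_sum_assignment(cdist(members[0].means_, m.means_))
    total += m.predict_proba(X)[:, perm]

classes = total.argmax(axis=1)                   # winning class per grid point
likelihood = total.max(axis=1) / len(members)    # mean winning probability
```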

Despite not using the coordinates of the data points in the model at all, the obtained classes are clearly location dependent, with grid points belonging to the same class always being close to each other. While many classes are defined by bathymetry changes, e.g., the continental shelf break and slope, sometimes other factors come into play, such as for the split of the Norwegian coast into two classes or for the division in the Barents Sea, which is probably based on the circulation. The North Sea is also split into three distinct regions, possibly based on sea level changes caused by dominant wind patterns.

This method can be applied to almost any atmospheric or oceanic variable and used for larger or smaller areas. It is quick and practical, allowing us to delimit areas based on information we cannot always clearly see in the data, which can facilitate a better selection of the regions that need further research.

How to cite: Poropat, L., Heuzé, C., and Reese, H.: Finding regions of similar sea level variability with the help of a Gaussian Mixture Model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-753, https://doi.org/10.5194/egusphere-egu23-753, 2023.

EGU23-849 | ECS | Orals | ITS1.14/CL5.8

Drivers of sea level variability using neural networks 

Linn Carlstedt, Lea Poropat, and Céline Heuzé

Understanding the forcing of regional sea level variability is crucial, as many people all over the world live along the coasts and are endangered by sea level rise. The addition of fresh water to the oceans due to the melting of the Earth's land ice, together with thermosteric changes, has led to a rise of the global mean sea level (GMSL) at an accelerating rate during the twentieth century; it has now reached a mean rate of 3.7 mm per year according to the IPCC's latest report. However, this change varies spatially, and the dynamics forcing sea level variability on regional to local scales are still less well known, making it hard for decision makers to mitigate and adapt with appropriate strategies.

Here we present a novel approach using machine learning (ML) to identify the dynamics and determine the most prominent drivers forcing coastal sea level variability. We use a recurrent neural network called Long Short-Term Memory (LSTM), which has the advantage of learning data in sequences and can thus store some memory from previous time steps, which is beneficial when dealing with time series. To train the model we use hourly ERA5 10-m wind, mean sea level pressure (MSLP), sea surface temperature (SST), evaporation and precipitation data from 2009 to 2017 in the North Sea region. To reduce the dimensionality of the data while preserving maximal information, we conduct a principal component analysis (PCA) after removing the climatology, calculated as hourly means over the years. Depending on the explained variance of the PCs for each driver, 2-4 PCs are chosen and cross-correlated to eliminate collinearity, which could affect the model results. Before being used in the ML model, the final preprocessed data are normalized by min-max scaling to optimize the learning. The target data for the model are hourly in-situ sea level observations from West-Terschelling in the Netherlands. Using in-situ observations rather than altimeter data enhances the ability to make good predictions in coastal zones, as altimeter data tend to degrade along the coasts. The sea level time series is preprocessed by removing tides and de-seasoned by subtracting the hourly means. To determine which drivers are most prominent for sea surface variability at our location, we mute one driver at a time in the training of the network and evaluate the resulting improvement or deterioration of the predictions.
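A minimal PyTorch sketch of such an LSTM regressor on driver PCs (all shapes and names are illustrative assumptions, not the study's configuration):

```python
import torch
import torch.nn as nn

class SeaLevelLSTM(nn.Module):
    """LSTM regressor: sequences of driver PCs -> sea level anomaly."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # predict from the last time step

x = torch.rand(32, 48, 10)              # 48-hour windows of min-max-scaled driver PCs
y_hat = SeaLevelLSTM(n_features=10)(x)  # (32, 1) sea level predictions
# "muting" a driver = zeroing its PC columns in training, then comparing skill
```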

Our results show that the zonal wind is the most prominent forcing of sea level variability at our location, followed by the meridional wind and MSLP. While SST greatly affects the GMSL, it seems to have little to no effect on local sea level variability compared to the other drivers. This approach shows great potential, can easily be applied to any coastal zone, and is thus very useful for a broad body of decision makers all over the world. Identifying the causes of local sea level variability will also enable better models for future predictions, which is of great importance and interest.

How to cite: Carlstedt, L., Poropat, L., and Heuzé, C.: Drivers of sea level variability using neural networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-849, https://doi.org/10.5194/egusphere-egu23-849, 2023.

EGU23-984 | ECS | Orals | ITS1.14/CL5.8

Data-driven Attributing of Climate Events with Climate Index Collection based on Model Data (CICMoD) 

Marco Landt-Hayen, Willi Rath, Sebastian Wahl, Nils Niebaum, Martin Claus, and Peer Kröger

Machine learning (ML) and in particular artificial neural networks (ANNs) push state-of-the-art solutions for many hard problems, e.g., image classification, speech recognition or time series forecasting. In the domain of climate science, ANNs have good prospects for identifying causally linked modes of climate variability as a key to understanding the climate system and improving the predictive skill of forecast systems. To attribute climate events in a data-driven way with ANNs, we need sufficient training data, which is often limited for real-world measurements. The data science community provides standard data sets for many applications. As a new data set, we introduce a collection of climate indices typically used to describe Earth system dynamics. This collection is consistent and comprehensive, as we derive the climate indices from 1,000-year control simulations of Earth System Models (ESMs). The data set is provided as an open-source framework that can be extended and customized to individual needs. It allows developing new ML methodologies and comparing results to existing methods and models as a benchmark. As examples, we use the data set to predict rainfall in the African Sahel region and the El Niño–Southern Oscillation with various ML models. We argue that this new data set allows a thorough exploration of techniques from the domain of explainable artificial intelligence, towards trustworthy models that are accepted by domain scientists. Our aim is to build a bridge between the data science community and researchers and practitioners from the domain of climate science to jointly improve our understanding of the climate system.
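As an example of the kind of index such a collection contains, a Niño 3.4 index can be derived from a control run's SST field along these lines (xarray sketch; file and variable names are hypothetical, and area weighting is omitted for the near-equatorial box):

```python
import xarray as xr

# hypothetical file/variable names for a monthly SST field from an ESM control run
sst = xr.open_dataset("esm_control_sst.nc")["tos"]
box = sst.sel(lat=slice(-5, 5), lon=slice(190, 240))   # Niño 3.4 region (170°W-120°W)

clim = box.groupby("time.month").mean("time")          # monthly climatology
anom = box.groupby("time.month") - clim                # monthly anomalies
nino34 = anom.mean(["lat", "lon"])                     # area-mean index
nino34_3m = nino34.rolling(time=3, center=True).mean() # 3-month running mean
```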

How to cite: Landt-Hayen, M., Rath, W., Wahl, S., Niebaum, N., Claus, M., and Kröger, P.: Data-driven Attributing of Climate Events with Climate Index Collection based on Model Data (CICMoD), EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-984, https://doi.org/10.5194/egusphere-egu23-984, 2023.

EGU23-1135 | ECS | Posters on site | ITS1.14/CL5.8

Curation of High-level Molecular Atmospheric Data for Machine Learning Purposes 

Vitus Besel, Milica Todorović, Theo Kurtén, Patrick Rinke, and Hanna Vehkamäki

As cloud and aerosol interactions remain large uncertainties in current climate models (IPCC), they are of special interest for atmospheric science. It is estimated that more than 70% of all cloud condensation nuclei originate from so-called New Particle Formation, the process of gaseous precursors clustering together in the atmosphere and subsequently growing into particles and aerosols. After initial clustering, this growth is driven strongly by condensation of low-volatility organic compounds (LVOC), i.e., molecules with saturation vapor pressures (pSat) below 10⁻⁶ mbar [1]. These originate from organic molecules emitted by vegetation that are subsequently rapidly oxidized in the air, so-called biogenic LVOC (BLVOC).

We have created a big data set of BLVOC using high-throughput computing and Density Functional Theory (DFT), and use it to train machine learning models to predict the pSat of previously unseen BLVOC. Figure 1 illustrates some sample molecules from the data.

[Figure 1: Sample molecules of small, medium and large sizes. Figure 2: Histogram of the calculated saturation vapor pressures.]

Initially, the chemical mechanism GECKO-A provides possible BLVOC molecules in the form of SMILES strings. In a first step, the COSMOconf program finds and optimizes the structures of possible conformers and provides their energies in the liquid phase at a DFT level of theory. After an additional calculation of the gas-phase energies with Turbomole, COSMOtherm calculates thermodynamic properties, such as the pSat, using the COSMO-RS [2] model. We combined all these computations into a highly parallelised high-throughput workflow to process 32k BLVOC, comprising over 7 million molecular conformers. A histogram of the calculated pSat is shown in Figure 2.

We use the calculated pSat to train a Gaussian Process Regression (GPR) machine learning model with the topological fingerprint as the descriptor for molecular structures. The GPR incorporates noise and outputs uncertainties for its pSat predictions. These uncertainties, together with data clustering techniques, allow for actively choosing the molecules to include in the training data, so-called active learning. Further, we explore SLISEMAP [3] explainable AI methods to correlate machine learning predictions, the high-dimensional descriptors, and human-readable properties such as functional groups.
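Schematically, fingerprint-based GPR with predictive uncertainties could look as follows (a minimal sketch with placeholder molecules and values, not the study's model; RDKit's topological fingerprint stands in for the descriptor):

```python
import numpy as np
from rdkit import Chem
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

smiles = ["CC(=O)O", "OCC(O)CO", "CC(C)=CCO"]          # placeholder molecules
log_psat = np.array([-2.1, -6.3, -3.0])                # placeholder log10(pSat)

# topological (RDKit) fingerprints as fixed-length bit-vector descriptors
X = np.array([list(Chem.RDKFingerprint(Chem.MolFromSmiles(s))) for s in smiles],
             dtype=float)

# the WhiteKernel term models noise; predictions come with uncertainties
gpr = GaussianProcessRegressor(RBF(10.0) + WhiteKernel(0.1)).fit(X, log_psat)
mean, std = gpr.predict(X, return_std=True)
# active learning: preferentially compute/label molecules with the largest std
```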

[1] Metzger, A. et al.: Evidence for the role of organics in aerosol particle formation under atmospheric conditions, Proc. Natl. Acad. Sci., 107, 6646–6651, https://doi.org/10.1073/pnas.0911330107, 2010.
[2] Klamt, A. and Schüürmann, G.: COSMO: a new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient, J. Chem. Soc., Perkin Trans. 2, 799–805, https://doi.org/10.1039/P29930000799, 1993.
[3] Björklund, A., Mäkelä, J., and Puolamäki, K.: SLISEMAP: supervised dimensionality reduction through local explanations, Mach. Learn., https://doi.org/10.1007/s10994-022-06261-1, 2022.

How to cite: Besel, V., Todorović, M., Kurtén, T., Rinke, P., and Vehkamäki, H.: Curation of High-level Molecular Atmospheric Data for Machine Learning Purposes, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1135, https://doi.org/10.5194/egusphere-egu23-1135, 2023.

EGU23-1244 | Posters on site | ITS1.14/CL5.8

Machine learning for non-orographic gravity waves in a climate model 

Steven Hardiman, Adam Scaife, Annelize van Niekerk, Rachel Prudden, Aled Owen, Samantha Adams, Tom Dunstan, Nick Dunstone, and Melissa Seabrook

There is growing use of machine learning algorithms to replicate sub-grid parametrisation schemes in global climate models. Parametrisations rely on approximations, and thus there is potential for machine learning to aid improvements. In this study, a neural network is used to mimic the behaviour of the non-orographic gravity wave scheme used in the Met Office climate model, which is important for stratospheric climate and variability. The neural network is found to require only two of the six inputs used by the parametrisation scheme, suggesting the potential for greater efficiency in this scheme. Use of a one-dimensional mechanistic model is advocated, allowing neural network hyperparameters to be trained based on emergent features of the coupled system with minimal computational cost, and providing a test bed prior to coupling to a climate model. A climate model simulation, using the neural network in place of the existing parametrisation scheme, is found to accurately generate a quasi-biennial oscillation of the tropical stratospheric winds, and correctly simulate the non-orographic gravity wave variability associated with the El Niño Southern Oscillation and stratospheric polar vortex variability. These internal sources of variability are essential for providing seasonal forecast skill, and the gravity wave forcing associated with them is reproduced without explicit training for these patterns.
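Schematically, such an emulator maps input columns to gravity wave drag profiles; a toy sketch with placeholder data (the actual scheme, inputs, and network architecture differ):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# inputs: two atmospheric profiles per column (the scheme's six inputs were
# found to reduce to two); outputs: a gravity wave drag profile per column
X = rng.normal(size=(5000, 2 * 70))    # two variables on 70 model levels
Y = rng.normal(size=(5000, 70))        # drag tendency on 70 levels

emulator = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=50).fit(X, Y)
drag = emulator.predict(X[:1])         # emulated tendencies for one column
```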

How to cite: Hardiman, S., Scaife, A., van Niekerk, A., Prudden, R., Owen, A., Adams, S., Dunstan, T., Dunstone, N., and Seabrook, M.: Machine learning for non-orographic gravity waves in a climate model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1244, https://doi.org/10.5194/egusphere-egu23-1244, 2023.

EGU23-1502 | ECS | Orals | ITS1.14/CL5.8

Adapting Transfer Learning for Multiple Channels in Satellite Data Applications 

Naomi Simumba and Michiaki Tatsubori

Transfer learning is a technique wherein information learned by previously trained models is applied to new learning tasks. Typically, weights learned by a network pretrained on other datasets are copied or transferred to new networks. These new networks, or downstream models, are then used for assorted tasks. Foundation models extend this concept by training models on large datasets. Such models gain a contextual understanding which can then be used to improve the performance of downstream tasks in different domains. Common examples include GPT-3 in the field of natural language processing and ImageNet-trained models in the field of computer vision.

Beyond its high rate of data collection, satellite data also has a wide range of meaningful applications, including climate impact modelling and sustainable energy. This makes foundation models trained on satellite data very beneficial, as they would reduce the time, data, and computational resources required to obtain useful downstream models for these applications.

However, satellite data models differ from typical computer vision models in a crucial way. Because several types of satellite data exist, each with its own benefits, a typical use case for satellite data involves combining multiple data inputs in configurations that are not readily apparent during pretraining of the foundation model. Essentially, this means that the downstream application may have a different number of input channels than the pretrained model, which raises the question of how to successfully transfer information learned by the pretrained model to the downstream application.

This research proposes and examines several architectures for the downstream model that allow pretrained weights to be incorporated when a different number of input channels is required. For evaluation, models pretrained with self-supervised learning on precipitation data are applied to a downstream model which conducts temporal interpolation of precipitation data and requires two inputs. The effect of including a perceptual loss to enhance model performance is also evaluated. These findings can be used to guide adaptation for applications ranging from flood modeling to land use detection and more.
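One common strategy for such a channel-count mismatch (our illustrative choice, not necessarily among the specific architectures this abstract examines) is to rebuild the first convolution and initialise the new filters from the mean of the pretrained filters:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def adapt_first_conv(model, in_channels):
    """Replace the pretrained 3-channel first conv so it accepts `in_channels`,
    initialising every new filter from the mean of the pretrained RGB filters."""
    old = model.conv1
    new = nn.Conv2d(in_channels, old.out_channels, old.kernel_size,
                    old.stride, old.padding, bias=False)
    with torch.no_grad():
        mean_w = old.weight.mean(dim=1, keepdim=True)          # (out, 1, k, k)
        new.weight.copy_(mean_w.repeat(1, in_channels, 1, 1))
    model.conv1 = new
    return model

model = adapt_first_conv(resnet18(weights="IMAGENET1K_V1"), in_channels=2)
x = torch.randn(4, 2, 224, 224)   # e.g. the two precipitation fields to interpolate
out = model(x)
```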

How to cite: Simumba, N. and Tatsubori, M.: Adapting Transfer Learning for Multiple Channels in Satellite Data Applications, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1502, https://doi.org/10.5194/egusphere-egu23-1502, 2023.

EGU23-1855 | ITS1.14/CL5.8

Multi-Temporal Downscaling of Streamflow for Ungauged Stations/Sub-Basins from Daily to Sub-Daily Interval Using Hybrid Framework – A Case Study on Flash Flood Watershed

V. Budamala, A. Wadhwa, and R. D. Bhowmik

Unprecedented flash floods (FF) in urban regions are increasing due to heavy rainfall intensity and magnitude as a result of human-induced climate and land-use changes. The changes in weather patterns and various anthropogenic activities increase the complexity of modelling FF at different spatiotemporal scales, which indicates the importance of multi-resolution forcing information. Hence, developing new methods for processing coarser-resolution spatiotemporal datasets is essential for the efficient modelling of FF. While a wide range of methods is available for spatial and temporal downscaling of climate data, multi-temporal downscaling strategies have not been investigated for ungauged streamflow stations. The current study proposes a multi-temporal downscaling (MTD) methodology for gauged and ungauged stations using adaptive emulator modelling concepts for daily to sub-daily streamflows. The proposed MTD framework for ungauged stations comprises a hybrid framework with conceptual and machine-learning-based approaches to analyze the catchment behavior and downscale the model outputs from daily to sub-daily scales. The study area, the Peachtree Creek watershed (USA), frequently experiences flash floods and was hence selected to validate the proposed framework. Further, the study addresses the critical issues of model development, seasonality, and diurnal variation of MTD data. The study obtained MTD data with minimal uncertainty in capturing the hydrological signatures and nearly 95% accuracy in predicting the flow attributes over ungauged stations. The proposed framework can be highly useful for short- and long-range planning, management, and mitigation measures where the absence of fine-resolution data prohibits flash flood modeling.

How to cite: Budamala, V., Wadhwa, A., and Bhowmik, R. D.: Multi-Temporal Downscaling of Streamflow for Ungauged Stations/ Sub-Basins from Daily to Sub-Daily Interval Using Hybrid Framework – A Case Study on Flash Flood Watershed, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1855, https://doi.org/10.5194/egusphere-egu23-1855, 2023.

EGU23-2289 | ECS | Posters on site | ITS1.14/CL5.8

Towards understanding the effect of parametric aerosol uncertainty on climate using a chemical transport model perturbed parameter ensemble. 

Meryem Bouchahmoud, Tommi Bergman, and Christina Williamson

Aerosols in the climate system have a direct link to the Earth’s energy balance. Aerosols interact directly with solar radiation through scattering and absorption, and indirectly by changing cloud properties. The effect aerosols have on climate is one of the major causes of radiative forcing (RF) uncertainty in global climate model simulations. Thus, reducing aerosol RF uncertainty is key to improving climate prediction. The objective of this work is to understand the magnitude and causes of aerosol uncertainty in the chemical transport model TM5.

Perturbed Parameter Ensembles (PPEs) are sets of model runs created by perturbing an ensemble of parameters. Parameters are model inputs; in this study, we focus on parameters describing aerosol emissions, properties and processes, such as dry deposition, aging rate, emissions, and aerosol microphysics. A PPE varies these parameters over their uncertainty ranges all at once to study their combined effect on TM5.

Varying these parameters through their value ranges is reflected in the TM5 outputs. The TM5 output variables we use in our sensitivity study are the cloud droplet number concentration and the ambient aerosol absorption optical thickness at 550 nm.

Here we discuss the design of the PPE and the one-at-a-time sensitivity studies used in this process. The PPE samples the parameter space in a way that enables emulation. Emulation is a machine learning technique that uses a statistical surrogate model in place of the chemical transport model, with the aim of providing output data with denser sampling throughout the parameter space. We will be using a Gaussian process emulator, which has been shown to be an efficient technique for quantifying parameter sensitivity in complex global atmospheric models.
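As an illustration of the emulation step, a Gaussian process emulator of a scalar TM5 output can be fitted to the PPE design and then queried densely across the parameter space; a scikit-learn sketch with placeholder data:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern

# Placeholder PPE design and response; real PPEs typically use a space-filling
# design (e.g. Latin hypercube) over the parameters' uncertainty ranges.
rng = np.random.default_rng(0)
X = rng.random((60, 8))            # 60 TM5 runs, 8 perturbed parameters
y = np.sin(3 * X[:, 0]) + X[:, 1]  # stand-in scalar output

kernel = ConstantKernel(1.0) * Matern(length_scale=np.ones(8), nu=2.5)
emulator = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# Dense resampling of the parameter space, with predictive uncertainty.
X_dense = rng.random((100_000, 8))
mean, std = emulator.predict(X_dense, return_std=True)
```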

We also describe plans to extend this work to emulate an aerosol PPE for EC-Earth. The PPE for EC-Earth will also contain cloud parameters that will vary over their uncertainty range together with the aerosol parameters to examine the influence of aerosol parametric uncertainty on RF.

How to cite: Bouchahmoud, M., Bergman, T., and Williamson, C.: Towards understanding the effect of parametric aerosol uncertainty on climate using a chemical transport model perturbed parameter ensemble., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2289, https://doi.org/10.5194/egusphere-egu23-2289, 2023.

EGU23-2541 | ECS | Posters on site | ITS1.14/CL5.8

Machine learning based automated parameter tuning of ICON-A using satellite data 

Pauline Bonnet, Fernando Iglesias-Suarez, Pierre Gentine, Marco Giorgetta, and Veronika Eyring

Global climate models use parameterizations to represent the effect of subgrid scale processes on the resolved state. Parameterizations in the atmosphere component usually include radiation, convection, cloud microphysics, cloud cover, gravity wave drag, vertical turbulence in the boundary layer and other processes. Parameterizations are semi-empirical functions that include a number of tunable parameters. Because these parameters are only loosely constrained by experimental data, a range of values is typically explored by evaluating model runs against observations and/or high-resolution runs. Fine-tuning a climate model is a complex inverse problem due to the number of tunable parameters and observed climate properties to fit. Moreover, parameterizations are a source of uncertainty for climate projections, so fine-tuning is a crucial step in model development.

Traditionally, tuning is a time-consuming task done manually by iteratively updating parameter values to explore the parameter space with experience-driven choices. To overcome this limitation and search the parameter space efficiently, one can implement automatic techniques. Typical steps in automatic tuning are: (i) constraining the scope of the study (model, simulation setup, parameters, metrics to fit and corresponding reference values); (ii) conducting a sensitivity analysis to reduce the parameter space and/or building an emulator for the climate model; and (iii) conducting a sophisticated grid search to determine the optimal parameter set or its distribution (e.g., rejection sampling and history matching). The ICOsahedral Non-hydrostatic (ICON) model is a modelling framework for numerical weather prediction and climate projections. We implement an ML-based automatic tuning technique to tune a recent version of ICON-A at a spatial resolution typically used for climate projections. We evaluate the tuned ICON-A model against satellite observations using the Earth System Model Evaluation Tool (ESMValTool). Although automatic tuning techniques reach optimal parameter values in fewer steps than manual tuning, they still require some experience-driven choices throughout the tuning process. Moreover, the performance of the tuned model is limited by the structural errors of the model, inherent to the mathematical description of its parameterizations.
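As an illustration of step (iii), history matching scores each candidate parameter set by an implausibility measure and rules out those exceeding a threshold (3 is customary); a minimal sketch with placeholder emulator output:

```python
import numpy as np

def implausibility(em_mean, em_std, obs, obs_err, discrepancy):
    # Distance between emulator prediction and observation, in units of the
    # combined (emulator + observation + structural) uncertainty.
    return np.abs(em_mean - obs) / np.sqrt(em_std**2 + obs_err**2 + discrepancy**2)

# Placeholder emulator output over a dense parameter sample.
em_mean = np.random.rand(10_000)
em_std = np.full(10_000, 0.05)
keep = implausibility(em_mean, em_std, obs=0.5, obs_err=0.02, discrepancy=0.05) < 3.0
```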

How to cite: Bonnet, P., Iglesias-Suarez, F., Gentine, P., Giorgetta, M., and Eyring, V.: Machine learning based automated parameter tuning of ICON-A using satellite data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2541, https://doi.org/10.5194/egusphere-egu23-2541, 2023.

EGU23-3404 | ECS | Posters on site | ITS1.14/CL5.8 | Highlight

Deep learning-based generation of 3D cloud structures from geostationary satellite data 

Sarah Brüning, Stefan Niebler, and Holger Tost

Clouds and their interdependent feedback mechanisms remain a source of uncertainty in climate science. Overcoming the related obstacles, especially in the context of a changing climate, makes the need for a reliable database more pressing than ever. While passive remote sensing sensors provide continuous observations of the cloud top, they lack vital information on the levels below. Here, active instruments can deliver valuable insights to fill this gap in knowledge.

This study sets out to combine the benefits of both instrument types. It aims (1) to reconstruct the vertical distribution of volumetric radar data along the cloud column and (2) to interpolate the resulting 3D cloud structure to the satellite’s full disk by applying a contemporary deep learning approach. Input data were derived by automated spatio-temporal matching between high-resolution satellite channels and the radar overflights. These samples provide the physical predictors fed into the network to reconstruct the vertical cloud distribution on each of the radar’s height levels across the whole domain. Data from the entire year 2017 were used to integrate seasonal variations into the modeling routine.

The results demonstrate not only the network’s ability to reconstruct the cloud column along the radar track but also to interpolate coherent structures into a large-scale perspective. While the model performs equally well over land and water bodies, its applicable time frame is limited to daytime predictions only. Finally, the generated data can be leveraged to build a comprehensive database of 3D cloud structures that is to be exploited in proceeding applications.

How to cite: Brüning, S., Niebler, S., and Tost, H.: Deep learning-based generation of 3D cloud structures from geostationary satellite data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3404, https://doi.org/10.5194/egusphere-egu23-3404, 2023.

EGU23-3418 | ECS | Posters on site | ITS1.14/CL5.8

Building a physics-constrained, fast and stable machine learning-based radiation emulator 

Guillaume Bertoli, Sebastian Schemm, Firat Ozdemir, Fernando Perez Cruz, and Eniko Szekely

Modelling the transfer of radiation through the atmosphere is a key component of weather and climate models. The operational radiation scheme in the Icosahedral Nonhydrostatic Weather and Climate Model (ICON) is ecRad. The ecRad scheme is accurate but computationally expensive: it is operationally run in ICON on a grid coarser than the dynamical grid, and the time step between two calls is significantly longer, which is known to reduce the quality of the climate prediction. A possible approach to accelerating the computation of the radiation fluxes is to use machine learning. Machine learning methods can significantly speed up the computation of radiation, but they may cause climate drift if they do not respect essential physical laws. In this work, we study random forest and neural network emulations of ecRad and compare different strategies for assessing the stability of the emulations. For the neural network, we compare loss functions with an additional energy penalty term and observe that modifying the loss function is essential for predicting the heating rates accurately. The random forest emulator, which is significantly faster to train than the neural network, is used as a reference model that the neural network must outperform. The random forest emulator can become extremely accurate, but its memory requirements quickly become prohibitive. Various numerical experiments are performed to illustrate the properties of the machine learning emulators.
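The abstract leaves the form of the energy penalty unspecified; a hedged sketch of how such a term might enter a PyTorch loss (the consistency relation below, column-integrated heating versus net flux difference, is an assumption for illustration):

```python
import torch

def radiation_loss(pred_flux, true_flux, pred_heat, true_heat, lam=0.1):
    # Fit fluxes and heating rates; last dimension is assumed to be the vertical.
    mse = torch.mean((pred_flux - true_flux) ** 2) \
        + torch.mean((pred_heat - true_heat) ** 2)
    # Hypothetical consistency term: column-integrated heating should match the
    # net flux difference between top of atmosphere and surface.
    gap = pred_heat.sum(dim=-1) - (pred_flux[..., 0] - pred_flux[..., -1])
    return mse + lam * torch.mean(gap ** 2)
```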

How to cite: Bertoli, G., Schemm, S., Ozdemir, F., Perez Cruz, F., and Szekely, E.: Building a physics-constrained, fast and stable machine learning-based radiation emulator, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3418, https://doi.org/10.5194/egusphere-egu23-3418, 2023.

EGU23-3457 | Orals | ITS1.14/CL5.8

Evaluating Vegetation Modelling in Earth System Models with Machine Learning Approaches 

Ranjini Swaminathan, Tristan Quaife, and Richard Allan

The presence and amount of vegetation in any given region control Gross Primary Production (GPP), the flux of carbon into the land driven by photosynthesis. Earth System Models (ESMs) give us the ability to simulate GPP by modelling the various interactions between the atmosphere and biosphere, including under future climate change scenarios. GPP is the largest flux of the global carbon cycle and plays an important role in, for example, carbon budget calculations. However, GPP estimates from ESMs not only vary widely, but the drivers underpinning this variability are also highly uncertain.

We use data from pre-industrial Control (pi-Control) simulations, both to take advantage of the longer time period available for sampling and to exclude the influence of anthropogenic forcing on GPP estimation, thereby leaving GPP largely attributable to two factors: (a) the input atmospheric forcings and (b) the processes using those input climate variables to diagnose GPP.

We explore the processes determining GPP with a physically-guided Machine Learning framework applied to a set of Earth System Models (ESMs) from the Sixth Coupled Model Intercomparison Project (CMIP6). We use this framework to examine whether differences in GPP across models are caused by differences in atmospheric state or process representations. 

Results from our analysis show that models with similar regional atmospheric forcing do not always have similar GPP distributions. While climate models largely agree on which atmospheric variables are most relevant for GPP in some regions, in others, such as the tropics, there is more uncertainty. Our analysis highlights the potential of ML to identify differences in atmospheric forcing and in carbon cycle process modelling across current state-of-the-art ESMs. It also allows us to extend the analysis with observational estimates of forcings and GPP for model improvement.

How to cite: Swaminathan, R., Quaife, T., and Allan, R.: Evaluating Vegetation Modelling in Earth System Models with Machine Learning Approaches, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3457, https://doi.org/10.5194/egusphere-egu23-3457, 2023.

EGU23-3619 | ECS | Posters on site | ITS1.14/CL5.8

TCDetect: A new method of Detecting the Presence of Tropical Cyclones using Deep Learning 

Daniel Galea, Julian Kunkel, and Bryan Lawrence

Tropical cyclones are high-impact weather events which have large human and economic effects, so it is important to be able to understand how their location, frequency and structure might change in a future climate.

Here, a lightweight deep learning model is presented which is intended for detecting the presence of tropical cyclones during the execution of numerical simulations for use in an online data reduction method. This will help to avoid saving vast amounts of data for analysis after the simulation is complete. With run-time detection, it might be possible to reduce the need for some of the high-frequency high-resolution output which would otherwise be required.

The model was trained on ERA-Interim reanalysis data from 1979 to 2017 and the training concentrated on delivering the highest possible recall rate (successful detection of cyclones) while rejecting enough data to make a difference in outputs.

When tested using data from the two subsequent years, the recall (probability of detection) was 92% and the precision (success ratio) was 36%. For the intended data reduction application, where the target includes all tropical cyclone events, even those which did not reach hurricane strength, the effective precision was 85%.

The recall rate and the Area Under Curve for the Precision/Recall (AUC-PR) compare favourably with other methods of cyclone identification while using the smallest number of parameters for both training and inference. 

Work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-843612

How to cite: Galea, D., Kunkel, J., and Lawrence, B.: TCDetect: A new method of Detecting the Presence of Tropical Cyclones using Deep Learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3619, https://doi.org/10.5194/egusphere-egu23-3619, 2023.

EGU23-3875 | ECS | Posters on site | ITS1.14/CL5.8

Explainable AI for oceanic carbon cycle analysis of CMIP6 

Paul Heubel, Lydia Keppler, and Tatiana Iliyna

The Southern Ocean acts as one of Earth's major carbon sinks, taking up anthropogenic carbon from the atmosphere. Earth System Models (ESMs) are used to project its future evolution. However, the ESMs in the Coupled Model Intercomparison Project version 6 (CMIP6) disagree on the biogeochemical representation of the Southern Ocean carbon cycle, both with respect to the phasing and the magnitude of the seasonal cycle of dissolved inorganic carbon (DIC), and they compare poorly with observations.

We develop a framework to investigate model biases in the historical runs of 10 CMIP6 ESMs, incorporating explainable artificial intelligence (xAI) methodologies. Using both a Random Forest feature relevance approach and a nonlinear self-organizing map plus feed-forward neural network (SOM-FFN) framework, we relate five potential drivers of the seasonal cycle of DIC in the Southern Ocean (temperature, salinity, silicate, nitrate, and dissolved oxygen) to its representation in the different CMIP6 models. This analysis allows us to determine the dominant statistical drivers of the seasonal cycle of DIC in each model and how they compare to observations. Our findings inform future model development to better constrain the seasonal cycle of DIC.
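The Random Forest feature relevance step can be sketched as follows, with placeholder data standing in for the model fields:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

drivers = ["temperature", "salinity", "silicate", "nitrate", "oxygen"]
X = np.random.rand(5000, 5)   # placeholder driver fields, flattened to samples
y = np.random.rand(5000)      # placeholder seasonal DIC anomaly

rf = RandomForestRegressor(n_estimators=300, min_samples_leaf=5, n_jobs=-1).fit(X, y)
imp = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
for name, score in zip(drivers, imp.importances_mean):
    print(f"{name}: {score:.3f}")
```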

How to cite: Heubel, P., Keppler, L., and Iliyna, T.: Explainable AI for oceanic carbon cycle analysis of CMIP6, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3875, https://doi.org/10.5194/egusphere-egu23-3875, 2023.

EGU23-4044 | ECS | Orals | ITS1.14/CL5.8

DailyMelt: Diffusion-based Models for Spatiotemporal Downscaling of (Ant-)arctic Surface Meltwater Maps 

Björn Lütjens, Patrick Alexander, Raf Antwerpen, Guido Cervone, Matthew Kearney, Bingkun Luo, Dava Newman, and Marco Tedesco

Motivation. Ice melt in Greenland and Antarctica has increasingly contributed to rising sea levels. Yet the exact speed of melting, the existence of abrupt tipping points, and the detailed links to climate change remain uncertain. Ice shelves essentially prevent the ice sheet from slipping into the ocean, and better prediction of their collapse is needed. Meltwater at the surface of ice shelves is an indicator of collapse, destabilizing the shelves via fracturing and flexural processes (Banwell et al., 2013), and it is likely impacted by a warming climate (Kingslake et al., 2017). Maps of meltwater have been created from in-situ and remote observations, but their low and irregular spatiotemporal resolution severely limits studies (Kingslake et al., 2019).

Research Gap. In particular, daily high-resolution (<500 m) maps of surface meltwater do not yet exist. We propose the first daily high-resolution surface meltwater maps, developed with a deep learning-based downscaling method, called DailyMelt, that fuses observations and simulations of varying spatiotemporal resolution, as illustrated in Fig. 1. The created maps will improve understanding of the origin, transport, and controlling physical processes of surface meltwater. Moreover, they will act as a unified source to improve sea level rise and meltwater predictions in climate models.

Data. To synthesize surface meltwater maps, we leverage observations from satellites (MODIS, Sen-1 SAR) which are high-resolution (500m, 10m), but have substantial temporal gaps due to repeat time and cloud coverage. We fuse them with simulations (MAR) and passive microwave observations (MEaSURE) that are daily, but low-resolution (6km, 3.125km). In a significant remote sensing effort, we have downloaded, reprojected, and regridded all products into daily observations for our study area over Greenland’s Helheim glacier. 

Approach and expected results. Within deep generative vision models, diffusion-based models promise sharp and probabilistic predictions. We have implemented SRDiff (Li H. et al., 2022) and tested it on spatially downscaling external data. As a baseline model, we have implemented a statistical downscaling model that is a local hybrid physics-linear regression model (Noel et al., 2016). In our planned benchmark, we expect a baseline UNet architecture that minimizes RMSE to create blurry maps and a generative adversarial network that minimizes adversarial loss to create sharp but deterministic maps. We have started with spatial downscaling and will include temporal downscaling. 
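A minimal sketch of the training step of a conditional DDPM-style downscaler, assuming an eps-prediction network (`denoiser`, e.g. a conditional U-Net, is left undefined here) conditioned by channel-concatenating the upsampled coarse field; SRDiff additionally diffuses the residual between the high-resolution target and the upsampled input, which this sketch omits:

```python
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def diffusion_loss(denoiser, x_hr, x_lr_up):
    # x_hr: high-res target; x_lr_up: coarse input upsampled to the target grid.
    b = x_hr.shape[0]
    t = torch.randint(0, T, (b,))
    a = alpha_bar[t].view(b, 1, 1, 1)
    eps = torch.randn_like(x_hr)
    x_t = a.sqrt() * x_hr + (1 - a).sqrt() * eps      # forward noising
    # Condition by channel-concatenating the coarse field.
    eps_hat = denoiser(torch.cat([x_t, x_lr_up], dim=1), t)
    return F.mse_loss(eps_hat, eps)
```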

In summary, we will create the first daily high-resolution (500m) surface meltwater maps, have introduced the first diffusion-based model for downscaling Earth sciences data, and have created the first benchmark dataset for downscaling surface meltwater maps.

References.

Banwell, A. F., et al. (2013), Breakup of the Larsen B Ice Shelf triggered by chain reaction drainage of supraglacial lakes, Geophys. Res. Lett., 40 

Kingslake J, et al. (2017), Widespread movement of meltwater onto and across Antarctic ice shelves, Nature, 544(7650)

Kingslake J., et al. (2019), Antarctic Surface Hydrology and Ice Shelf Stability Workshop report, US Antarctic Program Data Center

Li H., et al. (2022), SRDiff: Single image super-resolution with diffusion probabilistic models, Neurocomputing, 479

Noël, B., et al. (2016), A daily, 1 km resolution data set of downscaled Greenland ice sheet surface mass balance (1958–2015), The Cryosphere, 10

How to cite: Lütjens, B., Alexander, P., Antwerpen, R., Cervone, G., Kearney, M., Luo, B., Newman, D., and Tedesco, M.: DailyMelt: Diffusion-based Models for Spatiotemporal Downscaling of (Ant-)arctic Surface Meltwater Maps, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4044, https://doi.org/10.5194/egusphere-egu23-4044, 2023.

EGU23-4350 | ECS | Orals | ITS1.14/CL5.8

Physics-Constrained Deep Learning for Downscaling 

Paula Harder, Venkatesh Ramesh, Alex Hernandez-Garcia, Qidong Yang, Prasanna Sattigeri, Daniela Szwarcman, Campbell Watson, and David Rolnick

The availability of reliable, high-resolution climate and weather data is important to inform long-term decisions on climate adaptation and mitigation and to guide rapid responses to extreme events. Forecasting models are limited by computational costs and, therefore, often generate coarse-resolution predictions. Statistical downscaling can provide an efficient method of upsampling low-resolution data. In this field, deep learning has been applied successfully, often using image super-resolution methods from computer vision. However, despite achieving visually compelling results in some cases, such models frequently violate conservation laws when predicting physical variables. In order to conserve physical quantities, we develop methods that guarantee physical constraints are satisfied by a deep learning downscaling model while also improving their performance according to traditional metrics. We compare different constraining approaches and demonstrate their applicability across different neural architectures as well as a variety of climate and weather data sets, including ERA5 and WRF data sets.
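One way to guarantee such a constraint, among the approaches compared in work of this kind, is a layer that renormalizes each high-resolution block so it averages back to the coarse cell; a PyTorch sketch (a multiplicative correction; the constraining approaches actually evaluated in the study may differ):

```python
import torch
import torch.nn.functional as F

def conserve_mean(y_hr, x_lr, scale):
    # Rescale each (scale x scale) high-res block so its mean equals the
    # corresponding low-res cell, hard-enforcing conservation of the quantity.
    block_mean = F.avg_pool2d(y_hr, scale)
    ratio = x_lr / (block_mean + 1e-8)
    return y_hr * F.interpolate(ratio, scale_factor=scale, mode="nearest")
```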

How to cite: Harder, P., Ramesh, V., Hernandez-Garcia, A., Yang, Q., Sattigeri, P., Szwarcman, D., Watson, C., and Rolnick, D.: Physics-Constrained Deep Learning for Downscaling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4350, https://doi.org/10.5194/egusphere-egu23-4350, 2023.

EGU23-5431 | ECS | Orals | ITS1.14/CL5.8

Towards Robust Parameterizations in Ecosystem-level Photosynthesis Models 

Shanning Bao, Nuno Carvalhais, Lazaro Alonso, Siyuan Wang, Johannes Gensheimer, Ranit De, and Jiancheng Shi

Photosynthesis model parameters represent vegetation properties or the sensitivities of photosynthesis processes. As one of the sources of model uncertainty, parameters affect the accuracy and generalizability of a model. Ideally, parameters of ecosystem-level photosynthesis models, i.e., gross primary productivity (GPP) models, can be measured or inverted from observations at the local scale. To extrapolate parameters to larger spatial scales, current photosynthesis models typically adopt fixed values or plant-functional-type (PFT)-specific values. However, fixed and PFT-based parameterization approaches cannot sufficiently capture the spatial variability of parameters and lead to significant estimation errors. Here, we propose a Simultaneous Parameter Inversion and Extrapolation approach (SPIE) to overcome these issues.

SPIE refers to predicting model parameters using an artificial neural network (NN) constrained by both the model loss and ecosystem features, including PFT, climate type, bioclimatic variables, vegetation features, atmospheric nitrogen and phosphorus deposition, and soil properties. Taking a light use efficiency (LUE) model as an example, we evaluated SPIE at 196 FLUXNET eddy covariance flux sites. The LUE model accounts for the effects of air temperature, vapor pressure deficit, soil water availability (SW), light saturation, diffuse radiation fraction and CO2 on GPP using five independent sensitivity functions. SW was represented using the water availability index and can be optimized based on evapotranspiration. We thus optimized the NN by minimizing a model loss consisting of GPP errors, evapotranspiration errors, and constraints on the sensitivity functions. Furthermore, we compared SPIE with 11 typical parameter extrapolation approaches, including PFT- and climate-specific parameterizations, global and PFT-based parameter optimization, site similarity, and regression methods, using the Nash-Sutcliffe model efficiency (NSE), coefficient of determination (R2) and normalized root mean squared error (NRMSE).
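The core of SPIE can be sketched as a network that maps ecosystem features to model parameters and is trained end-to-end through the process model; everything below (layer sizes, the stand-in lue_model) is a placeholder, not the authors' code:

```python
import torch
import torch.nn as nn

# Stand-in network: 40 ecosystem features -> 10 positive LUE-model parameters.
param_net = nn.Sequential(nn.Linear(40, 64), nn.ReLU(),
                          nn.Linear(64, 10), nn.Softplus())

def lue_model(params, climate):
    # Placeholder forward model; the real one multiplies a maximum LUE by
    # sensitivity functions of temperature, VPD, SW, light and CO2.
    return params[:, :1] * climate[:, :1]

def spie_loss(features, climate, gpp_obs):
    params = param_net(features)          # parameter prediction per site
    gpp_hat = lue_model(params, climate)  # run the process model forward
    # The full loss also includes evapotranspiration errors and constraints
    # on the sensitivity functions.
    return torch.mean((gpp_hat - gpp_obs) ** 2)
```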

In ten-fold cross-validation, SPIE showed the best performance across various temporal and spatial scales and across assessment metrics. None of the parameter extrapolation approaches reached the performance of on-site calibrated parameters (NSE=0.95), but SPIE was the only approach showing a positive NSE (0.68) in cross-validation across sites. Moreover, the site-level NSE, R2, and NRMSE of SPIE all significantly outperformed the per-biome and per-climate-type parameterizations. Parameter ranges were more tightly constrained by SPIE than by site calibrations.

Overall, SPIE is a robust parameter extrapolation approach that overcomes strong limitations observed in many standard model parameterization approaches. Our approach suggests that model parameterizations can be determined from observations of vegetation, climate and soil properties, expanding beyond customary clustering methods (e.g., PFT-specific parameterization). We argue that extending SPIE to other models overcomes current limits in parameterization and serves as an entry point to investigate the robustness and generalization of different models.

How to cite: Bao, S., Carvalhais, N., Alonso, L., Wang, S., Gensheimer, J., De, R., and Shi, J.: Towards Robust Parameterizations in Ecosystem-level Photosynthesis Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5431, https://doi.org/10.5194/egusphere-egu23-5431, 2023.

EGU23-5487 * | ECS | Posters on site | ITS1.14/CL5.8 | Highlight

Harvesting historical spy imagery by evaluating deep learning models for state-wide mapping of land cover changes between 1965-1978 

Lucas Kugler, Christopher Marrs, Eric Kosczor, and Matthias Forkel

Remote sensing has played a fundamental role in land cover mapping and change detection at least since the launch of the Landsat satellite program in 1972. In 1995, the Central Intelligence Agency of the United States of America released previously classified spy imagery, taken from 1960 onwards with near-global coverage, from the Keyhole programme, which includes the CORONA satellite mission. CORONA imagery is a treasure because it contains information about land cover from 10 years before the beginning of civilian Earth observation and has a high spatial resolution of < 2 m. However, this imagery is only panchromatic and usually not georeferenced, which has so far prevented large-scale applications for land cover mapping or other geophysical and environmental purposes.

Here, we aim to harvest the valuable information about past land cover from CORONA imagery for a state-wide mapping of past land cover changes between 1965 and 1978 by training, testing and validating various deep learning models.

To the best of our knowledge, this is the first work to analyse land cover from CORONA data on a large scale, dividing land cover into six classes based on the CORINE classification scheme. The particular focus of the work is to test the transferability of the deep learning approaches to unknown CORONA data.

To investigate the transferability, we selected 27 spatially and temporally distributed study areas (each 23 km²) in the Free State of Saxony (Germany) and created semantic masks to train and test 10 different U-shaped neural network architectures for extracting land cover from CORONA data. As input, we use either the original panchromatic pixel values or different texture measures. From these input data, ten training and test datasets were derived for cross-validation.

The training results show that semantic segmentation of land cover from CORONA data with the tested architectures is possible. Strong differences in model performance (based on cross-validation and the intersection-over-union metric, IOU) were detected among the classes. Classes with many samples achieve significantly better IOU values than underrepresented classes. In general, a U-shaped architecture with a Transformer encoder (Transformer U-Net) achieved the best results. The best segmentation performance (IOU 83.29%) was obtained for forests, followed by agriculture (74.21%). For artificial surfaces, a mean IOU of 68.83% was achieved, and water surfaces reached a mean IOU of 66.49%. For the shrub vegetation and open area classes, IOU values were mostly below 25%. The deep learning models transferred successfully in space (between test areas) and time (between CORONA imagery from different years), especially for classes with many samples. Transferability was difficult for the mapping of water bodies, and it was limited for imagery of very poor quality despite the generally good model performance and successful transferability for most classes. Our approach enabled the state-wide mapping of land cover in Saxony between 1965 and 1978 with a spatial resolution of 2 m. We identify an increase in urban cover and a decrease in cropland cover.
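For reference, the per-class IOU reported above is the intersection of predicted and true class masks divided by their union; a minimal sketch:

```python
import numpy as np

def iou(pred, target, cls):
    # Intersection over union for one land-cover class in a segmentation map.
    p, t = (pred == cls), (target == cls)
    union = np.logical_or(p, t).sum()
    return np.logical_and(p, t).sum() / union if union else np.nan
```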

How to cite: Kugler, L., Marrs, C., Kosczor, E., and Forkel, M.: Harvesting historical spy imagery by evaluating deep learning models for state-wide mapping of land cover changes between 1965-1978, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5487, https://doi.org/10.5194/egusphere-egu23-5487, 2023.

EGU23-5583 | ECS | Posters on site | ITS1.14/CL5.8

Identifying and Locating Volcanic Eruptions using Convolutional Neural Networks and Interpretability Techniques 

Johannes Meuer, Claudia Timmreck, Shih-Wei Fang, and Christopher Kadow

Accurately interpreting past climate variability can be a challenging task, particularly when it comes to distinguishing between forced and unforced changes. In the case of large volcanic eruptions, ice core records are a very valuable tool but are often still not sufficient to link reconstructed anomaly patterns to a volcanic eruption at all, or to its geographical location. In this study, we developed a convolutional neural network (CNN) that is able to classify whether a volcanic eruption occurred and its location (northern hemisphere extratropical, southern hemisphere extratropical, or tropics) with an accuracy of 92%.

To train the CNN, we used 100-member ensembles of the MPI-ESM-LR global climate model, generated using the Easy Volcanic Aerosol (EVA) model, which provides the radiative forcing of idealized volcanic eruptions of different strengths and locations. The model considered global sea surface temperature and precipitation patterns over a period of 3 months, 12 months after the eruption.

In addition to demonstrating the high accuracy of the CNN, we also applied layer-wise relevance propagation (LRP) to the model to understand its decision-making process and identify the input data that influenced its predictions. Our study demonstrates the potential of using CNNs and interpretability techniques for identifying and locating past volcanic eruptions as well as improving the accuracy and understanding of volcanic climate signals.
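LRP can be applied with, for example, captum's implementation, assuming the network is built from layers captum supports; the model below is a small stand-in, not the study's CNN:

```python
import torch
import torch.nn as nn
from captum.attr import LRP

# Stand-in CNN: (SST, precipitation) anomaly maps -> 4 classes
# (no eruption / NH extratropical / tropical / SH extratropical).
model = nn.Sequential(
    nn.Conv2d(2, 4, kernel_size=3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(4 * 96 * 192, 4),
).eval()

x = torch.randn(1, 2, 96, 192)                     # one input sample
pred = model(x).argmax(dim=1).item()
relevance = LRP(model).attribute(x, target=pred)   # same shape as x
```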

How to cite: Meuer, J., Timmreck, C., Fang, S.-W., and Kadow, C.: Identifying and Locating Volcanic Eruptions using Convolutional Neural Networks and Interpretability Techniques, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5583, https://doi.org/10.5194/egusphere-egu23-5583, 2023.

EGU23-5967 | ECS | Posters on site | ITS1.14/CL5.8

Potentials and challenges of using Explainable AI for understanding atmospheric circulation 

Sebastian Scher, Andreas Trügler, and Jakob Abermann

Machine Learning (ML) and AI techniques, especially methods based on Deep Learning, have long been considered black boxes that might be good at predicting, but not at explaining predictions. This has changed recently, with more techniques becoming available that explain the predictions of ML models, known as Explainable AI (XAI). These have also been adopted in climate science, because they could have the potential to help us understand the physics behind phenomena in the geosciences. It is, however, unclear how large that potential really is and how these methods can be incorporated into the scientific process. In our study, we use the exemplary research question of which aspects of the large-scale atmospheric circulation affect specific local conditions. We compare the different answers to this question obtained with a range of methods, from the traditional approach of targeted data analysis based on physical knowledge (such as dimensionality reduction based on physical reasoning) to purely data-driven, physics-unaware methods using Deep Learning with XAI techniques. Based on these insights, we discuss the usefulness and potential pitfalls of XAI for understanding and explaining phenomena in the geosciences.

How to cite: Scher, S., Trügler, A., and Abermann, J.: Potentials and challenges of using Explainable AI for understanding atmospheric circulation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5967, https://doi.org/10.5194/egusphere-egu23-5967, 2023.

EGU23-6061 | ECS | Orals | ITS1.14/CL5.8 | Highlight

Using reduced representations of atmospheric fields to quantify the causal drivers of air pollution 

Sebastian Hickman, Paul Griffiths, Peer Nowack, and Alex Archibald

Air pollution contributes to millions of deaths worldwide every year. The concentration of a particular air pollutant, such as ozone, is controlled by physical and chemical processes which act on varying temporal and spatial scales. Quantifying the strength of causal drivers of air pollution (e.g. temperature) from observational data, particularly at the extremes, is challenging due to the difficulty of disentangling correlation and causation, as many drivers are correlated. Furthermore, because air pollution is controlled in part by large-scale atmospheric phenomena, using local covariates (e.g. at the individual grid cell level) for analysis is insufficient to fully capture the effect of these phenomena on air pollution.

Access to large spatiotemporal datasets of air pollutant concentrations and atmospheric variables, coupled with recent advances in self-supervised learning, allow us to learn reduced representations of spatiotemporal atmospheric fields, and therefore account for non-local and non-instantaneous processes in downstream tasks.

We show that these learned reduced representations can be useful for tasks such as air pollution forecasting, and crucially to quantify the causal effect of varying atmospheric fields on air pollution. We make use of recent advances in bounding causal effects in the presence of unobserved confounding to estimate, with uncertainty, the causal effect of changing atmospheric fields on air pollution. Finally, we compare our quantification of the causal drivers of air pollution to results from other approaches, and explore implications for our methods and for the wider goal of improving the process-level treatment of air pollutants in chemistry-climate models.

How to cite: Hickman, S., Griffiths, P., Nowack, P., and Archibald, A.: Using reduced representations of atmospheric fields to quantify the causal drivers of air pollution, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6061, https://doi.org/10.5194/egusphere-egu23-6061, 2023.

EGU23-6306 | ECS | Orals | ITS1.14/CL5.8 | Highlight

Data-Driven Cloud Cover Parameterizations 

Arthur Grundner, Tom Beucler, Pierre Gentine, Marco A. Giorgetta, Fernando Iglesias-Suarez, and Veronika Eyring

A promising approach to improve cloud parameterizations within climate models, and thus climate projections, is to train machine learning algorithms on storm-resolving model (SRM) output. The ICOsahedral Non-hydrostatic (ICON) modeling framework permits simulations ranging from numerical weather prediction to climate projections, making it an ideal target to develop data-driven parameterizations for sub-grid scale processes. Here, we systematically derive and evaluate the first data-driven cloud cover parameterizations with coarse-grained data based on ICON SRM simulations. These parameterizations range from simple analytic models and symbolic regression fits to neural networks (NNs), populating a performance x complexity plane. In most models, we enforce sparsity and discourage correlated features by sequentially selecting features based on the models' performance gains. Guided by a set of physical constraints, we use symbolic regression to find a novel equation to parameterize cloud cover. The equation represents a good compromise between performance and complexity, achieving the highest performance (R^2>0.9) for its complexity (13 trainable parameters). To model sub-grid scale cloud cover in its full complexity, we also develop three different types of NNs that differ in the degree of vertical locality they assume for diagnosing cloud cover from coarse-grained atmospheric state variables. Using the game-theory based interpretability library SHapley Additive exPlanations, we analyze our most non-local NN and identify an overemphasis on specific humidity and cloud ice as the reason why it cannot perfectly generalize from the global to the regional coarse-grained SRM data. The interpretability tool also helps visualize similarities and differences in feature importance between regionally and globally trained NNs, and reveals a local relationship between their cloud cover predictions and the thermodynamic environment. Our results show the potential of deep learning and symbolic regression to derive accurate yet interpretable cloud cover parameterizations from SRMs.
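The SHAP analysis step can be sketched as follows; the model and data are stand-ins (the actual networks ingest coarse-grained atmospheric state profiles):

```python
import torch
import torch.nn as nn
import shap

model = nn.Sequential(nn.Linear(6, 32), nn.Tanh(), nn.Linear(32, 1))  # stand-in NN
X_background = torch.randn(200, 6)   # background sample of coarse-grained states
X_explain = torch.randn(20, 6)       # cases to explain

explainer = shap.DeepExplainer(model, X_background)
shap_values = explainer.shap_values(X_explain)  # per-feature attributions
```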

How to cite: Grundner, A., Beucler, T., Gentine, P., Giorgetta, M. A., Iglesias-Suarez, F., and Eyring, V.: Data-Driven Cloud Cover Parameterizations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6306, https://doi.org/10.5194/egusphere-egu23-6306, 2023.

EGU23-6450 | ECS | Orals | ITS1.14/CL5.8

The key role of causal discovery to improve data-driven parameterizations in climate models 

Fernando Iglesias-Suarez, Veronika Eyring, Pierre Gentine, Tom Beucler, Michael Pritchard, Jakob Runge, and Breixo Solino-Fernandez

Earth system models are fundamental to understanding and projecting climate change, although there are considerable biases and uncertainties in their projections. A large contribution to this uncertainty stems from differences in the representation of clouds and convection occurring at scales smaller than the resolved model grid. These long-standing deficiencies in cloud parameterizations have motivated developments of computationally costly global high-resolution cloud resolving models, that can explicitly resolve clouds and convection. Deep learning can learn such explicitly resolved processes from cloud resolving models. While unconstrained neural networks often learn non-physical relationships that can lead to instabilities in climate simulations, causally-informed deep learning can mitigate this problem by identifying direct physical drivers of subgrid-scale processes. Both unconstrained and causally-informed neural networks are developed using a superparameterized climate model in which deep convection is explicitly resolved, and are coupled to the climate model. Prognostic climate simulations with causally-informed neural network parameterization are stable, accurately represent mean climate and variability of the original climate model, and clearly outperform its non-causal counterpart. Combining causal discovery and deep learning is a promising approach to improve data-driven parameterizations (informed by causally-consistent physical fields) for both their design and trustworthiness.
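Causal input selection of this kind is commonly done with PCMCI; a sketch using the tigramite package with placeholder data (whether PCMCI is the exact method used here is not stated; import paths follow tigramite 4.x and moved in later versions):

```python
import numpy as np
from tigramite import data_processing as pp
from tigramite.pcmci import PCMCI
from tigramite.independence_tests import ParCorr  # moved in tigramite >= 5

data = np.random.randn(1000, 6)   # columns: candidate drivers + subgrid target
dataframe = pp.DataFrame(data, var_names=["T", "q", "u", "v", "p", "target"])
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr())
results = pcmci.run_pcmci(tau_max=3, pc_alpha=0.05)
# Significant parents of "target" become the only inputs of the NN scheme.
```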

How to cite: Iglesias-Suarez, F., Eyring, V., Gentine, P., Beucler, T., Pritchard, M., Runge, J., and Solino-Fernandez, B.: The key role of causal discovery to improve data-driven parameterizations in climate models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6450, https://doi.org/10.5194/egusphere-egu23-6450, 2023.

EGU23-7457 | ECS | Posters on site | ITS1.14/CL5.8

Towards the effective autoencoder architecture to detect weather anomalies 

Dusan Fister, Jorge Pérez-Aracil, César Peláez-Rodríguez, Marie Drouard, Pablo G. Zaninelli, David Barriopedro Cepero, Ricardo García-Herrera, and Sancho Salcedo-Sanz

When weather data are organised as images, pixels represent coordinates and pixel magnitudes represent the state of the observed variable at a given time. Observed variables, such as air temperature, mean sea level pressure and wind components, may be collected into higher-dimensional images or even into a motion structure. Encoding the former as spatial data and the latter as spatio-temporal data allows them to be processed with deep learning methods, for instance autoencoders and autoencoder-like architectures. The objective of the original autoencoder is to reproduce the input image as closely as possible, effectively equalising input and output during training. The autoencoder can then be used to calculate the deviations between (1) the true states (effectively the inputs), which are produced by nature, and (2) the expected states, which are derived by statistical learning. These deviations can be interpreted to identify extreme events, such as heatwaves, hot days or other rare events (so-called anomalies). Additionally, by modelling the deviations with statistical distributions, geographical areas with higher probabilities of anomalies can be identified in the tails of the distribution. The capability to reproduce the original input images is hence crucial to avoid mistaking arbitrary noise for anomalies. We run experiments to identify effective architectures that give reasonable solutions, verify the benefits of a variational autoencoder, assess the effect of different statistical loss functions, and determine an effective architecture for the decoder part of the autoencoder.
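The anomaly-scoring step then reduces to comparing inputs with their reconstructions; a minimal sketch assuming a trained Keras-style autoencoder `ae` and a quantile threshold (both assumptions of this sketch):

```python
import numpy as np

# X: daily fields, shape (n_days, height, width, channels); ae: trained
# autoencoder exposing a Keras-style predict() (an assumption of this sketch).
recon = ae.predict(X)                             # expected states
score = np.abs(X - recon).mean(axis=(1, 2, 3))    # per-day deviation
threshold = np.quantile(score, 0.99)              # tail of the score distribution
anomalous_days = np.where(score > threshold)[0]
```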

How to cite: Fister, D., Pérez-Aracil, J., Peláez-Rodríguez, C., Drouard, M., G. Zaninelli, P., Barriopedro Cepero, D., García-Herrera, R., and Salcedo-Sanz, S.: Towards the effective autoencoder architecture to detect weather anomalies, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7457, https://doi.org/10.5194/egusphere-egu23-7457, 2023.

EGU23-7465 | ECS | Posters on site | ITS1.14/CL5.8

Invertible neural networks for satellite retrievals of aerosol optical depth 

Paolo Pelucchi, Jorge Vicent, J. Emmanuel Johnson, Philip Stier, and Gustau Camps-Valls

The retrieval of atmospheric aerosol properties from satellite remote sensing is a complex and under-determined inverse problem. Traditional retrieval algorithms, based on radiative transfer models, must make approximations and assumptions to reach a unique solution or repeatedly use the expensive forward models to be able to quantify uncertainty. The recently introduced Invertible Neural Networks (INNs), a machine learning method based on Normalizing Flows, appear particularly suited for tackling inverse problems. They simultaneously model both the forward and the inverse branches of the problem, and their generative aspect allows them to efficiently provide non-parametric posterior distributions for the retrieved parameters, which can be used to quantify the retrieval uncertainty. So far INNs have successfully been applied to low-dimensional idealised inverse problems and even to some simpler scientific retrieval problems. Still, satellite aerosol retrievals present particular challenges, such as the high variability of the surface reflectance signal and the often comparatively small aerosol signal in the top-of-the-atmosphere (TOA) measurements.
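The building block that makes such networks analytically invertible is the coupling layer; a minimal RealNVP-style sketch in PyTorch (the retrieval INN, which conditions on TOA reflectances, is more involved):

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One invertible affine coupling block: half the variables parameterize an
    affine transform of the other half, so the inverse is available in closed form."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.d = dim // 2
        self.net = nn.Sequential(nn.Linear(self.d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * (dim - self.d)))

    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        s, t = self.net(x1).chunk(2, dim=1)
        return torch.cat([x1, x2 * torch.exp(s) + t], dim=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.d], y[:, self.d:]
        s, t = self.net(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=1)
```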

In this study, we investigate the use of INNs for retrieving aerosol optical depth (AOD) and its uncertainty estimates at the pixel level from MODIS TOA reflectance measurements. The models are trained with custom synthetic datasets of TOA reflectance-AOD pairs made by combining the MODIS Dark Target algorithm’s atmospheric look-up tables and a MODIS surface reflectance product. The INNs are found to perform emulation and inversion of the look-up tables successfully. We initially train models adapted to different surface types by focusing our application on limited regional and seasonal contexts. The models are applied to real measurements from the MODIS sensor, and the generated AOD retrievals and posterior distributions are compared to the corresponding Dark Target and AERONET retrievals for evaluation and discussion.

How to cite: Pelucchi, P., Vicent, J., Johnson, J. E., Stier, P., and Camps-Valls, G.: Invertible neural networks for satellite retrievals of aerosol optical depth, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7465, https://doi.org/10.5194/egusphere-egu23-7465, 2023.

The rapid development of deep learning has transformed many fields, and precipitation prediction is one of them. Precipitation modeling remains a challenge for numerical weather prediction and climate models, and parameterization is required for low-spatial-resolution models, such as those used in climate change impact studies. Machine learning models have been shown to be capable of learning the relationships between other meteorological variables and precipitation. Such models are much less computationally intensive than explicit modeling of precipitation processes and are becoming more accurate than parameterization schemes.

Most existing applications focus either on precipitation extremes aggregated over a domain of interest or on average precipitation fields. Here, we are interested in spatial extremes and focus on the prediction of heavy precipitation events (>95th percentile) and extreme events (>99th percentile) over the European domain. Meteorological variables from ERA5 are used as input, and E-OBS data as target. Different architectures from the literature are compared in terms of predictive skill for average precipitation fields as well as for the occurrence of heavy or extreme precipitation events (threshold exceedance). U-Net architectures show higher skills than other variants of convolutional neural networks (CNN). We also show that a shallower U-Net architecture performs as well as the original network for this application, thus reducing the model complexity and, consequently, the computational resources. In addition, we analyze the number of inputs based on the importance of the predictors provided by a layer-wise relevance propagation procedure.

How to cite: Horton, P. and Otero, N.: Predicting spatial precipitation extremes with deep learning models. A comparison of existing model architectures., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7862, https://doi.org/10.5194/egusphere-egu23-7862, 2023.

EGU23-8085 | ECS | Posters on site | ITS1.14/CL5.8

Improving the spatial accuracy of extreme tropical cyclone rainfall in ERA5 using deep learning 

Guido Ascenso, Andrea Ficchì, Leone Cavicchia, Enrico Scoccimarro, Matteo Giuliani, and Andrea Castelletti

Tropical cyclones (TCs) are one of the costliest and deadliest natural disasters due to the combination of their strong winds and induced storm surges and heavy precipitation, which can cause devastating floods. Unfortunately, due to its high spatio-temporal variability, complex underlying physical process, and lack of high-quality observations, precipitation is still one of the most challenging aspects of a TC to model. However, as precipitation is a key forcing variable for hydrological processes acting across multiple space-time scales, accurate precipitation input is crucial for reliable hydrological simulations and forecasts.

A popular source of precipitation data is the ERA5 reanalysis dataset, frequently used as input to hydrological models when studying floods. However, ERA5 systematically underestimates TC-induced precipitation compared to MSWEP, a multi-source observational dataset fusing gauge, satellite, and reanalysis-based data, currently one of the most accurate precipitation datasets. Moreover, the spatial distribution of TC-rainfall in ERA5 has large room for improvement.

Here, we present a precipitation correction scheme based on U-Net, a popular deep-learning architecture. Rather than only adjusting the per-pixel precipitation values at each timestep of a given TC, we explicitly design our model to also adjust the spatial distribution of the precipitation; to the best of our knowledge, we are the first to do so. The key novelty of our model is a custom-made loss function, based on the combination of the fractions skill score (FSS) and mean absolute error (MAE) metrics. We train and validate the model on 100k time steps (with an 80:20 train:test split) from global TC precipitation events. We show how a U-Net trained with our loss function can reduce the per-pixel MAE of ERA5 precipitation by nearly as much as other state-of-the-art methods, while surpassing them significantly in terms of improved spatial patterns of precipitation. Finally, we discuss how the outputs of our model can be used for future research.
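A sketch of how an FSS term can be combined with MAE in a differentiable PyTorch loss; the threshold, neighbourhood size, weighting and sigmoid softening below are illustrative assumptions, not the authors' exact formulation:

```python
import torch
import torch.nn.functional as F

def fss_mae_loss(pred, obs, thresh=1.0, window=9, alpha=0.5):
    # Soft "rain > thresh" masks keep the exceedance step differentiable.
    p = torch.sigmoid((pred - thresh) * 10.0)
    o = (obs > thresh).float()
    # Neighbourhood fractions via average pooling (inputs are (B, 1, H, W)).
    pf = F.avg_pool2d(p, window, stride=1, padding=window // 2)
    of = F.avg_pool2d(o, window, stride=1, padding=window // 2)
    # Fractions skill score: 1 is a perfect spatial match, 0 is no skill.
    fss = 1.0 - ((pf - of) ** 2).mean() / ((pf ** 2).mean() + (of ** 2).mean() + 1e-8)
    return alpha * F.l1_loss(pred, obs) + (1.0 - alpha) * (1.0 - fss)
```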

How to cite: Ascenso, G., Ficchì, A., Cavicchia, L., Scoccimarro, E., Giuliani, M., and Castelletti, A.: Improving the spatial accuracy of extreme tropical cyclone rainfall in ERA5 using deep learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8085, https://doi.org/10.5194/egusphere-egu23-8085, 2023.

EGU23-8496 | ECS | Posters on site | ITS1.14/CL5.8

Utilizing AI emulators to Model Stratospheric Aerosol Injections and their Effect on Climate 

Eshaan Agrawal and Christian Schroder de Witt

With no end to anthropogenic greenhouse gas emissions in sight, policymakers are increasingly debating artificial mechanisms to cool the Earth's climate. One such solution is stratospheric aerosol injection (SAI), a method of solar geoengineering in which particles are injected into the stratosphere to reflect the sun's rays and lower global temperatures. Past volcanic events suggest that SAI can lead to fast, substantial surface temperature reductions, and it is projected to be economically feasible. Research in simulation, however, suggests that SAI can lead to catastrophic side effects. It is also controversial among politicians and environmentalists because of the numerous geopolitical, environmental, and human-health challenges it poses. Nevertheless, SAI is increasingly receiving attention from policymakers. In this research project, we use deep reinforcement learning to study whether, and by how much, carefully engineered temporally and spatially varying injection strategies can mitigate catastrophic side effects of SAI. To do this, we use the HadCM3 global circulation model to collect climate system data in response to artificial longitudinal aerosol injections. We then train a neural network emulator on these data and use it to learn optimal injection strategies under a variety of objectives by alternating model updates with reinforcement learning. We release our dataset and code as a benchmark to improve emulator creation for solar aerosol engineering modeling.

How to cite: Agrawal, E. and Schroder de Witt, C.: Utilizing AI emulators to Model Stratospheric Aerosol Injections and their Effect on Climate, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8496, https://doi.org/10.5194/egusphere-egu23-8496, 2023.

Multiple studies have now demonstrated that machine learning (ML) can give improved skill for simulating fairly typical weather events in climate simulations, for tasks such as downscaling to higher resolution and emulating and speeding up expensive model parameterisations. Many of these used ML methods with very high numbers of parameters, such as neural networks, which are the focus of the discussion here. Not much attention has been given to the performance of these methods for the extreme event severities relevant to many critical weather and climate prediction applications, with return periods of more than a few years. This leaves a lot of uncertainty about the usefulness of these methods, particularly for general-purpose models that must perform reliably in extreme situations. ML models may be expected to struggle to predict extremes because there are usually few samples of such events.
 
This presentation will review the small number of studies that have examined the skill of machine learning methods in extreme weather situations. It will be shown using recent results that machine learning methods that perform reasonably for typical weather events can have very large errors in extreme situations, highlighting the necessity of testing the performance for these cases. Extrapolation to extremes is found to work well in some studies, however. 
 
It will be argued that more attention needs to be given to performance for extremes in work applying ML in climate science. Research gaps that seem particularly important are identified. These include investigating the behaviour of ML systems in events that are multiple standard deviations beyond observed records, which have occurred in the past, and evaluating performance of complex generative models in extreme events. Approaches to address these problems will be discussed.

How to cite: Watson, P.: Machine learning applications for weather and climate need greater focus on extremes, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8615, https://doi.org/10.5194/egusphere-egu23-8615, 2023.

EGU23-8661 | Posters on site | ITS1.14/CL5.8

An urban climate neural network screening tool 

Robert von Tils and Sven Wiemers

Microscale RANS (Reynolds Averaged Navier Stokes) models are able to simulate the urban climate for entire large cities with a high spatial resolution of up to 5 m horizontally. They do this using data from geographic information systems (GIS) that must be specially processed to provide the models with information about the terrain, buildings, land use, and resolved vegetation. If high-performance computers, for example from research institutions, are not available for the simulations or are beyond the financial scope, the calculation on commercially available servers can take several weeks. The calculation of a reference initial state for a city is often followed by questions regarding adaptation measures due to climate change or the influence of smaller and larger future building developments on the urban climate. These changes lead locally to a change of the urban climate but are also influenced by the urban climate itself.

In order to save computational time and to comfortably give a quantitative fast initial assessment, we trained a neural network that predicts the simulation results of a RANS model (for example: air temperature at night and during the day, wind speed, cold air flow) and implemented this network in a GIS. The tool allows to calculate the impact of development projects on the urban climate in a fraction of the time required by a RANS simulation and comes close to the RANS model in terms of accuracy. It can also be used by people without in-depth knowledge of urban climate modeling and is therefore particularly suitable for use, for example, in specialized offices of administrative departments or by project developers.

How to cite: von Tils, R. and Wiemers, S.: An urban climate neural network screening tool, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8661, https://doi.org/10.5194/egusphere-egu23-8661, 2023.

EGU23-8666 | ECS | Posters on site | ITS1.14/CL5.8

Drivers of Natural Gas Use in United States Buildings 

Rohith Teja Mittakola, Philippe Ciais, Jochen Schubert, David Makowski, Chuanlong Zhou, Hassan Bazzi, Taochun Sun, Zhu Liu, and Steven Davis

Natural gas is the primary fuel used in U.S. residences, especially during winter, when cold temperatures drive the heating demand. In this study, we use daily county-level gas consumption data to assess the spatial patterns of the relationships and sensitivities of gas consumption by U.S. households to outdoor temperature. Linear-plus-plateau functions are found to best fit gas consumption and are applied to derive two key coefficients for each county: the heating temperature threshold (Tcrit) below which residential heating starts, and the rate of increase in gas consumption when the outdoor temperature drops by one degree (Slope). We then use interpretable machine learning models to evaluate the key building properties and socioeconomic factors related to the spatial patterns of Tcrit and Slope, based on a large database of individual household properties and population census data. We find that building age, employment rates, and household size are the main predictors of Tcrit, whereas the share of gas as a heating fuel and household income are the main predictors of Slope. The latter result suggests inequalities across the U.S. with respect to gas consumption, with wealthy people living in well-insulated houses associated with low Tcrit and Slope values. Finally, we estimate potential reductions in gas use in U.S. residences due to improvements in household insulation or a hypothetical behavioral change toward reduced consumption, by adopting a 1°C lower Tcrit than the current value and a reduced slope. These two scenarios would result in 25% lower gas consumption at the national scale, avoiding 1.24 million MtCO2 of emissions per year. Most of these reductions occur in the Midwest and East Coast regions. The results from this study provide new quantitative information for targeting efforts to reduce household gas use and related CO2 emissions in the U.S.
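The county-level linear-plus-plateau fit can be sketched with scipy on synthetic data (the study fits daily county-level consumption against outdoor temperature):

```python
import numpy as np
from scipy.optimize import curve_fit

def gas_use(T, base, slope, Tcrit):
    # Plateau at `base` above Tcrit; linear increase of `slope` per degC below it.
    return base + slope * np.maximum(Tcrit - T, 0.0)

T = np.random.uniform(-10, 30, 365)   # placeholder daily mean temperatures
y = gas_use(T, 1.0, 0.3, 15.0) + 0.1 * np.random.randn(T.size)
(base, slope, Tcrit), _ = curve_fit(gas_use, T, y, p0=[1.0, 0.2, 15.0])
```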

How to cite: Mittakola, R. T., Ciais, P., Schubert, J., Makowski, D., Zhou, C., Bazzi, H., Sun, T., Liu, Z., and Davis, S.: Drivers of Natural Gas Use in United States Buildings, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8666, https://doi.org/10.5194/egusphere-egu23-8666, 2023.

EGU23-8921 | ECS | Posters on site | ITS1.14/CL5.8

Identification of sensitive regions to climate change and anticipation of climate events in Brazil 

Angelica Caseri and Francisco A. Rodrigues

In Brazil, the water system is essential for the electrical system and agribusiness. Understanding climate change and predicting long-term hydrometeorological phenomena are vital for developing and maintaining these sectors in the country. This work uses data from the SIN (National Interconnected System) for the main hydrological basins of Brazil, together with historical rainfall data, in complex networks and deep learning algorithms, to identify possible climate changes in Brazil and to predict future hydrometeorological phenomena. The predictions generated with the methodology developed in this work showed satisfactory results, which allows identifying regions more sensitive to climate change and anticipating climate events. This work is expected to help the energy generation system and the agribusiness sector in Brazil, the main sectors driving the country's economy.

How to cite: Caseri, A. and A. Rodrigues, F.: Identification of sensitive regions to climate change and anticipation of climate events in Brazil, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8921, https://doi.org/10.5194/egusphere-egu23-8921, 2023.

EGU23-9337 | ECS | Posters on site | ITS1.14/CL5.8

Modeling landscape-scale vegetation response to climate: Synthesis of the EarthNet challenge 

Vitus Benson, Christian Requena-Mesa, Claire Robin, Lazaro Alonso, Nuno Carvalhais, and Markus Reichstein

The biosphere displays high heterogeneity at the landscape scale. Vegetation modelers struggle to represent this variability in process-based models because global observations of micrometeorology and plant traits are not available at such fine granularity. However, remote sensing data are available: the Sentinel-2 satellites capture aspects of localized vegetation dynamics at 10 m resolution. The EarthNet challenge (EarthNet2021, [1]) aims at predicting satellite imagery conditioned on coarse-scale weather data. Multiple research groups have approached this challenge with deep learning [2,3,4]. Here, we evaluate how well these satellite image models simulate the vegetation response to climate, approximating the vegetation status by the normalized difference vegetation index (NDVI).

Achieving the new vegetation-centric evaluation requires three steps. First, we update the original EarthNet2021 dataset to be suitable for vegetation modeling: EarthNet2021x includes improved georeferencing, a land cover map, and a more effective cloud mask. Second, we introduce an interpretable evaluation metric, the VegetationScore: the Nash-Sutcliffe model efficiency (NSE) of NDVI predictions against clear-sky observations per vegetated pixel, aggregated to dataset level through normalization. The ground truth NDVI time series achieves a VegetationScore of 1; the target-period mean NDVI achieves a VegetationScore of 0. Third, we assess the skill of two deep neural networks with the VegetationScore: ConvLSTM [2,3], which combines convolutions and recurrence, and EarthFormer [4], a Transformer adaptation for Earth science problems.
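
A minimal sketch of the per-pixel NSE underlying the VegetationScore (before the dataset-level normalization described above); by construction the observations themselves score 1 and the observed mean scores 0.

    import numpy as np

    def nse(pred, obs):
        # Nash-Sutcliffe efficiency of one pixel's clear-sky NDVI series.
        pred, obs = np.asarray(pred), np.asarray(obs)
        return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

    obs = np.array([0.42, 0.48, 0.55, 0.60, 0.58])    # toy NDVI observations
    print(nse(obs, obs))                              # ground truth -> 1.0
    print(nse(np.full(obs.size, obs.mean()), obs))    # mean predictor -> 0.0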

Both models significantly outperform the persistence baseline. They do not display systematic biases and generally capture spatial patterns. Yet, both neural networks achieve a negative VegetationScore: in only about 20% of vegetated pixels do the deep learning models beat a hypothetical model predicting the true target-period mean NDVI. This is partly because the models largely underestimate the temporal variability. However, the target variability may partially be inflated by the noisy nature of the observed NDVI. Additionally, increasing uncertainty at longer lead times decreases scores: the mean RMSE in the first 25 days is 50% lower than between 75 and 100 days lead time. In general, consistent with the EarthNet2021 leaderboard, EarthFormer outperforms the ConvLSTM. With EarthNet2021x, a narrower perspective on the EarthNet challenge is introduced. Modeling localized vegetation response is a task that requires careful adjustment of off-the-shelf computer vision architectures for them to excel. The resulting specialized approaches can then be used to advance our understanding of the complex interactions between vegetation and climate.



[1] Requena-Mesa, Benson, Reichstein, Runge and Denzler. EarthNet2021: A large-scale dataset and challenge for Earth surface forecasting as a guided video prediction task. CVPR Workshops, 2021.

[2] Diaconu, Saha, Günnemann and Zhu. Understanding the Role of Weather Data for Earth Surface Forecasting Using a ConvLSTM-Based Model. CVPR Workshops, 2022.

[3] Kladny, Milanta, Mraz, Hufkens and Stocker. Deep learning for satellite image forecasting of vegetation greenness. bioRxiv, 2022.

[4] Gao, Shi, Wang, Zhu, Wang, Li and Yeung. Earthformer: Exploring Space-Time Transformers for Earth System Forecasting. NeurIPS, 2022.

How to cite: Benson, V., Requena-Mesa, C., Robin, C., Alonso, L., Carvalhais, N., and Reichstein, M.: Modeling landscape-scale vegetation response to climate: Synthesis of the EarthNet challenge, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9337, https://doi.org/10.5194/egusphere-egu23-9337, 2023.

EGU23-9434 | ECS | Posters on site | ITS1.14/CL5.8

Enhancing environmental sensor data quality control with graph neural networks 

Elżbieta Lasota, Julius Polz, Christian Chwala, Lennart Schmidt, Peter Lünenschloß, David Schäfer, and Jan Bumberger

The rapidly growing number of low-cost environmental sensors and of data from opportunistic sensors constantly advances the quality as well as the spatial and temporal resolution of weather and climate models. However, it also creates the need for effective tools to ensure the quality of the collected data.

Quality control (QC) of time series from multiple spatially irregularly distributed sensors is a challenging task, as it requires the simultaneous integration and analysis of observations from sparse neighboring sensors and consecutive time steps. Manual QC is very often time- and labour-intensive and requires expert knowledge, which introduces subjectivity and limits reproducibility. Automatic, accurate, and robust QC solutions are therefore in high demand, and machine learning techniques stand out among them.

In this study, we present a novel approach for the quality control of time series data from multiple spatially irregularly distributed sensors using graph neural networks (GNNs). Although we applied our method to commercial microwave link attenuation data collected from a network in Germany between April and October 2021, our solution aims to be generic with respect to the number and type of sensors. The proposed approach uses an autoencoder architecture in which the GNN models the spatial relationships between the sensors, allowing contextual information to be incorporated into the quality control process.
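
A minimal sketch of such a graph autoencoder, assuming PyTorch Geometric; the window length, layer widths, and toy graph are illustrative, not the exact architecture used in the study.

    import torch
    from torch_geometric.nn import GCNConv

    class GraphAutoencoder(torch.nn.Module):
        # Encode each sensor's time-series window using neighbouring sensors;
        # a large reconstruction error marks the window as suspect.
        def __init__(self, window=32, hidden=16, latent=4):
            super().__init__()
            self.enc = GCNConv(window, hidden)
            self.bottleneck = GCNConv(hidden, latent)
            self.dec = torch.nn.Linear(latent, window)

        def forward(self, x, edge_index):
            h = torch.relu(self.enc(x, edge_index))
            z = torch.relu(self.bottleneck(h, edge_index))
            return self.dec(z)

    x = torch.randn(3, 32)                             # 3 sensors, 32 time steps
    edge_index = torch.tensor([[0, 1, 1, 2, 2, 0],     # toy ring connectivity
                               [1, 0, 2, 1, 0, 2]])
    recon = GraphAutoencoder()(x, edge_index)
    anomaly_score = ((recon - x) ** 2).mean(dim=1)     # per-sensor error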

While our model shows promising results in initial tests, further research is needed to fully evaluate its effectiveness and to demonstrate its potential in a wider range of environmental applications. Eventually, our solution will allow us to further foster the observational basis of our understanding of the natural environment.

How to cite: Lasota, E., Polz, J., Chwala, C., Schmidt, L., Lünenschloß, P., Schäfer, D., and Bumberger, J.: Enhancing environmental sensor data quality control with graph neural networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9434, https://doi.org/10.5194/egusphere-egu23-9434, 2023.

EGU23-9810 | ECS | Orals | ITS1.14/CL5.8

Integration of a deep-learning-based fire model into a global land surface model 

Rackhun Son, Nuno Carvalhais, Lazaro Silva, Christian Requena-Mesa, Ulrich Weber, Veronika Gayler, Tobias Stacke, Reiner Schnur, Julia Nabel, Alexander Winkler, and Sönke Zaehle

Fire is a ubiquitous process within the Earth system that has significant impacts on terrestrial ecosystems. Process-based fire models quantify fire disturbance effects in stand-alone dynamic global vegetation models (DGVMs) and within coupled Earth system models (ESMs), and their advances have incorporated both descriptions of natural processes and anthropogenic drivers. However, skill in modeling and predicting fire at the global scale remains limited, mostly due to the stochastic nature of fire, but also due to the limits of the empirical parameterizations in these process-based models. As an alternative, statistical approaches have shown the advantages of machine learning in providing robust diagnostics of fire damage, though with limited value for process-based modeling frameworks. Here, we develop a deep-learning-based fire model (DL-fire) to estimate the gridded burned area fraction at the global scale and couple it within JSBACH4, the land surface model used in the ICON ESM. We compare the resulting hybrid model integrating DL-fire into JSBACH4 (JDL-fire) against the standard fire model within JSBACH4 and the stand-alone DL-fire results. The stand-alone DL-fire model forced with observations performs well in simulating the global burnt fraction, with a monthly correlation (Rm) with the Global Fire Emissions Database (GFED4) of 0.78 and 0.80 at the global scale during the training (2004-10) and validation (2011-15) periods, respectively. The performance remains nearly the same when evaluating the hybrid modeling approach JDL-fire (Rm=0.76 and 0.86 in the training and evaluation periods, respectively). This outperforms the standard fire model currently used in JSBACH4 (Rm=-0.16 and 0.22 in the training and evaluation periods, respectively) by far. We further evaluate the modeling results across specific fire regions and apply layer-wise relevance propagation (LRP) to quantify the importance of each predictor. Overall, land properties, such as fuel amount and water content in soil layers, stand out as the major factors determining burnt fraction in DL-fire, paralleled by meteorological conditions, over tropical and high-latitude regions. Our study demonstrates the potential of hybrid modeling in advancing the predictability of Earth system processes by integrating statistical learning approaches into physics-based dynamical systems.
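
For reference, the reported skill metric reduces to a simple correlation of monthly series; a minimal sketch with synthetic stand-ins for the model output and the GFED4 observations:

    import numpy as np

    def monthly_corr(sim, obs):
        # Rm: Pearson correlation of monthly burned-fraction time series.
        sim, obs = np.asarray(sim), np.asarray(obs)
        return np.corrcoef(sim, obs)[0, 1]

    months = 84                                        # e.g. 2004-2010, monthly
    obs = np.random.default_rng(0).gamma(2.0, 1.0, months)
    sim = obs + np.random.default_rng(1).normal(0.0, 0.5, months)
    rm = monthly_corr(sim, obs)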

How to cite: Son, R., Carvalhais, N., Silva, L., Requena-Mesa, C., Weber, U., Gayler, V., Stacke, T., Schnur, R., Nabel, J., Winkler, A., and Zaehle, S.: Integration of a deep-learning-based fire model into a global land surface model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9810, https://doi.org/10.5194/egusphere-egu23-9810, 2023.

EGU23-10219 | ECS | Posters on site | ITS1.14/CL5.8

Identifying compound weather prototypes of forest mortality with β-VAE 

Mohit Anand, Friedrich Bohn, Lily-belle Sweet, Gustau Camps-Valls, and Jakob Zscheischler

Forest health is affected by many interacting and correlated weather variables over multiple temporal scales. Climate change affects weather conditions and their dependencies. To better understand future forest health and status, an improved scientific understanding of the complex relationships between weather conditions and forest mortality is required. Explainable AI (XAI) methods are increasingly used to understand and simulate physical processes in complex environments given enough data. In this work, an hourly weather generator (AWE-GEN) is used to simulate 200,000 years of daily weather conditions representative of central Germany. It is capable of simulating low- and high-frequency characteristics of weather variables and also captures the inter-annual variability of precipitation. These data are then used to drive an individual-based forest model (FORMIND) to simulate the dynamics of a beech, pine, and spruce forest. A variational autoencoder (β-VAE) is used to learn representations of the generated weather conditions, which include radiation, precipitation and temperature. We learn shared and variable-specific latent representations using a decoder network that remains the same for all weather variables; the representation learning is completely unsupervised. Using the output of the forest model, we identify single and compounding weather prototypes that are associated with extreme forest mortality. We find that the prototypes associated with extreme mortality are similar for pine and spruce forests and slightly different for beech forests. Furthermore, although the compounding weather prototypes represent a larger sample size (2.4%-3.5%) than the single prototypes (1.7%-2.2%), they are associated with higher levels of mortality on average. Overall, our research illustrates how deep learning frameworks can be used to identify weather patterns that are associated with extreme impacts.
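
A minimal sketch of the β-VAE objective used for such representation learning, assuming PyTorch; the β value is illustrative.

    import torch
    import torch.nn.functional as F

    def beta_vae_loss(recon, x, mu, logvar, beta=4.0):
        # Reconstruction error plus beta-weighted KL divergence of the
        # approximate posterior N(mu, exp(logvar)) from the prior N(0, I);
        # beta > 1 encourages disentangled latent factors.
        recon_err = F.mse_loss(recon, x, reduction="sum")
        kl = -0.5 * torch.sum(1.0 + logvar - mu.pow(2) - logvar.exp())
        return recon_err + beta * kl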

 

How to cite: Anand, M., Bohn, F., Sweet, L., Camps-Valls, G., and Zscheischler, J.: Identifying compound weather prototypes of forest mortality with β-VAE, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10219, https://doi.org/10.5194/egusphere-egu23-10219, 2023.

Hydrological models and machine learning models are widely used in streamflow simulation and data reconstruction. However, a global assessment of these models is still lacking, and no synthesized catchment-scale streamflow product derived from multiple models is available globally. In this study, we comprehensively evaluated four conceptual hydrological models (GR2M, XAJ, SAC, Alpine) and four machine learning models (RF, GBDT, DNN, CNN) based on 16,218 selected gauging stations worldwide, and then applied a multi-model weighting ensemble (MWE) method to merge the streamflow simulated by these models. Generally, the average performance of the machine learning models across all stations is better than that of the hydrological models, with more stations reaching a qualified simulation accuracy (KGE>0.2); however, the hydrological models achieve a higher percentage of stations with a good simulation accuracy (KGE>0.6). Specifically, for the average accuracy during the validation period, 67% (27%) and 74% (21%) of stations reached a "qualified" ("good") level for the hydrological models and machine learning models, respectively. The XAJ is the best-performing of the four hydrological models, particularly in tropical and temperate zones. Among the machine learning models, the GBDT model performs better at the global scale. The MWE can effectively improve the simulation accuracy and performs much better than the traditional multi-model arithmetic ensemble (MAE), especially the constrained least squares combination method (CLS), with 82% (28%) of the stations having a "qualified" ("good") accuracy. Furthermore, by exploring the factors influencing the streamflow simulation, we found that both the machine learning models and the hydrological models perform better in wetter areas.
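
For reference, a minimal sketch of the Kling-Gupta efficiency behind the "qualified" (KGE>0.2) and "good" (KGE>0.6) thresholds, assuming the standard 2009 formulation:

    import numpy as np

    def kge(sim, obs):
        # Kling-Gupta efficiency (Gupta et al., 2009): 1 is a perfect fit.
        sim, obs = np.asarray(sim), np.asarray(obs)
        r = np.corrcoef(sim, obs)[0, 1]       # linear correlation
        alpha = sim.std() / obs.std()         # variability ratio
        beta = sim.mean() / obs.mean()        # bias ratio
        return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)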

How to cite: Zhang, J. and Liu, J.: Simulation and reconstruction of global monthly runoff based on hydrological models and machine learning models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10391, https://doi.org/10.5194/egusphere-egu23-10391, 2023.

Physics-based numerical weather prediction models (NWPs) and radar-based probabilistic methods have mainly been used for short-term precipitation prediction. Recently, radar-based precipitation nowcasting models using advanced machine learning (ML) have been actively developed. Although ML-based models show outstanding performance in short-term rainfall prediction, their performance decreases significantly with increasing lead time, and they have the limitation of being black-box models that do not consider the physical processes of the atmosphere. To address these limitations, we aimed to develop a hybrid precipitation nowcasting model that combines NWP and an advanced ML-based model via an ML-based ensemble method. The Weather Research and Forecasting (WRF) model was used as the NWP to generate physics-based rainfall forecasts. We developed the ML-based precipitation nowcasting model with a conditional generative adversarial network (cGAN), which shows high performance in image generation tasks. Radar reflectivity data, WRF hindcast meteorological outputs (e.g., temperature and wind speed), and static information on the target basin (e.g., DEM, land cover) were used as input data for the cGAN-based model to generate physics-informed rainfall predictions at lead times of up to 6 hours. The cGAN-based model was trained on data for the summer seasons of 2014-2017. In addition, we proposed an ML-based blending method based on XGBoost that combines the cGAN-based model results and the WRF forecast results. To evaluate the hybrid model performance, we analyzed precipitation predictions for three heavy rain events in South Korea. The results confirmed that using the blending method to develop a hybrid model can provide an improved precipitation nowcasting approach.
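
A minimal sketch of such a blending step on synthetic data; the per-pixel feature choice is an illustrative assumption, not the authors' exact configuration.

    import numpy as np
    from xgboost import XGBRegressor

    rng = np.random.default_rng(1)
    n = 50_000                                           # pixels pooled over events
    cgan = rng.gamma(2.0, 2.0, n)                        # cGAN nowcast (mm/h)
    wrf = rng.gamma(2.0, 2.0, n)                         # WRF forecast (mm/h)
    lead = rng.integers(1, 7, n)                         # lead time (h)
    obs = 0.7 * cgan + 0.3 * wrf + rng.normal(0, 0.5, n) # synthetic radar "truth"

    X = np.column_stack([cgan, wrf, lead])
    blender = XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)
    blender.fit(X, obs)                                  # learns lead-dependent weights
    blended = blender.predict(X[:5])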

 

Acknowledgements

 This work was supported by a grant from the National Research Foundation of Korea funded by the Ministry of Science, ICT & Future Planning (2020R1A2C2007670).

How to cite: Choi, S. and Kim, Y.: Developing hybrid precipitation nowcasting model with WRF and conditional GAN-based model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10431, https://doi.org/10.5194/egusphere-egu23-10431, 2023.

EGU23-10568 | ECS | Orals | ITS1.14/CL5.8

Extended-range predictability of stratospheric extreme events using explainable neural networks 

Zheng Wu, Tom Beucler, and Daniela Domeisen

Extreme stratospheric events such as extremely weak and strong polar vortex events can influence tropospheric weather from weeks to months and are thus important sources of predictability for tropospheric weather on subseasonal to seasonal (S2S) timescales. However, the predictability of weak vortex events is limited to 1-2 weeks in state-of-the-art forecasting systems, while strong vortex events are more predictable than weak ones. Longer predictability timescales for stratospheric extreme events would benefit long-range surface weather prediction. Recent studies have shown promising results in the use of machine learning for improving weather prediction. The goal of this study is to explore the potential of a machine learning approach for extending the predictability of stratospheric extreme events on S2S timescales. We use neural networks (NNs) to predict the monthly stratospheric polar vortex strength with lead times of up to five months, using as precursors the first five principal components (PCs) of the sea surface temperature (SST), mean sea level pressure (MSLP), Barents-Kara sea-ice concentration (BK-SIC), poleward heat flux at 100 hPa, and zonal wind at 50, 30, and 2 hPa. These physical variables are chosen because previous studies have indicated them as potential precursors of stratospheric extremes. The results show that accuracy and Brier Skill Score decrease with longer lead times and that performance is similar between weak and strong vortex events. We then employ two different NN attribution methods to uncover feature importance (heat maps) in the inputs, which indicates the relevance of each input for the NN prediction. The heat maps suggest that precursors from the lower stratosphere are important for predicting the stratospheric polar vortex strength at a lead time of one month, while precursors at the surface and in the upper stratosphere become more important at lead times longer than one month. This result is overall consistent with previous studies suggesting that subseasonal precursors of stratospheric extreme events may come from the lower troposphere. Our study sheds light on the potential of explainable NNs for identifying opportunities for skillful prediction of stratospheric extreme events and, by extension, surface weather on S2S timescales.
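
As a simpler stand-in for the attribution analysis (not one of the two NN attribution methods used in the study), the sketch below ranks toy precursor inputs with permutation importance; data and labels are synthetic, with one toy PC per variable.

    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.inspection import permutation_importance

    rng = np.random.default_rng(0)
    names = ["SST", "MSLP", "BK-SIC", "heat flux 100 hPa", "U50", "U30", "U2"]
    X = rng.normal(size=(600, len(names)))
    y = (X[:, 4] + 0.5 * X[:, 3] + rng.normal(0, 1, 600) > 0).astype(int)

    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000).fit(X, y)
    imp = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
    for name, score in sorted(zip(names, imp.importances_mean), key=lambda t: -t[1]):
        print(f"{name}: {score:.3f}")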

How to cite: Wu, Z., Beucler, T., and Domeisen, D.: Extended-range predictability of stratospheric extreme events using explainable neural networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10568, https://doi.org/10.5194/egusphere-egu23-10568, 2023.

One of the main challenges for forecasting fire activity is the tradeoff between accuracy at the finer spatial scales relevant to local decision making and predictability over seasonal (next 2-4 months) and subseasonal-to-seasonal (next 2 weeks to 2 months) timescales. To achieve predictability at long lead times and high spatial resolution, several analyses in the literature have constructed statistical models of fire activity using only antecedent climate predictors. In this talk, however, I will present preliminary seasonal forecasts of wildfire frequency and burned area for the western United States using SMLFire1.0, a stochastic machine learning (SML) fire model that relies on observed antecedent climate and vegetation predictors as well as seasonal forecasts of fire-month climate. In particular, I will discuss results obtained by forcing the SMLFire1.0 model with seasonal forecasts from: a) downscaled and bias-corrected North American Multi-Model Ensemble (NMME) outputs, and b) skill-weighted climate analogs constructed using an autoregressive ML model. I will also comment upon the relative contributions of uncertainties, from climate forecasts and fire model simulations respectively, to projections of wildfire frequency and burned area across several spatial scales and lead times.

How to cite: Buch, J., Williams, A. P., and Gentine, P.: Seasonal forecasts of wildfire frequency and burned area in the western United States using a stochastic machine learning fire model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11238, https://doi.org/10.5194/egusphere-egu23-11238, 2023.

EGU23-11355 | Posters on site | ITS1.14/CL5.8

Estimation of Fine Dust Concentration from BGR Images in Surveillance Cameras 

Hoyoung Cha, Jongyun Byun, Jongjin Baik, and Changhyun Jun

This study proposes a novel approach for estimating fine dust concentration from raw video data recorded by surveillance cameras. First, several regions of interest are defined from specific images extracted from videos recorded by surveillance cameras installed at Chung-Ang University. Among them, sky regions are mainly considered to capture changes in the characteristics of each color channel. After converting RGB images into BGR images, the number of discrete pixels with high brightness intensity in the blue channel is analyzed by investigating its relationship with fine dust concentrations measured at automatic monitoring stations near the campus. Threshold values from 125 to 200 are considered to find optimal conditions from changes in the values of each pixel in the blue channel. This study uses the Pearson correlation coefficient to quantify the relation between the number of pixels above the selected threshold and observed fine dust concentrations. For one example date, the coefficients show positive correlations ranging from 0.57 to 0.89 across thresholds. It should be noted that this study is a novel attempt to suggest a new, simple, and efficient method for estimating fine dust concentration from surveillance cameras, which are common in many areas around the world.
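
A minimal sketch of the pixel-counting and correlation step; the threshold, toy frame, and toy numbers are illustrative.

    import numpy as np
    from scipy.stats import pearsonr

    def bright_blue_count(bgr_image, threshold=150):
        # Count pixels whose blue channel (index 0 in BGR order) exceeds the
        # threshold; thresholds of 125-200 are scanned in the study.
        return int(np.count_nonzero(bgr_image[:, :, 0] > threshold))

    frame = np.random.default_rng(0).integers(0, 256, (480, 640, 3), dtype=np.uint8)
    n_bright = bright_blue_count(frame)                  # one frame's count

    counts = np.array([4120.0, 3890.0, 3550.0, 3300.0, 2980.0])  # toy hourly counts
    pm = np.array([18.0, 22.0, 31.0, 35.0, 44.0])                # toy station PM data
    r, p = pearsonr(counts, pm)                                  # correlation check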

 

Keywords: Fine Dust Concentration, BGR Image, Surveillance Camera, Threshold, Correlation Analysis

 

Acknowledgment

This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea government (MSIT) (No. NRF-2022R1A4A3032838 and 2020R1G1A1013624) and by the Korea Meteorological Administration Research and Development Program under Grant KMI2022-01910.

How to cite: Cha, H., Byun, J., Baik, J., and Jun, C.: Estimation of Fine Dust Concentration from BGR Images in Surveillance Cameras, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11355, https://doi.org/10.5194/egusphere-egu23-11355, 2023.

EGU23-12137 | ECS | Posters on site | ITS1.14/CL5.8

Identifying mechanisms of low-level jets near coast of Kurzeme using Principal Component Analysis 

Maksims Pogumirskis, Tija Sīle, and Uldis Bethers

Low-level jets are maxima in the vertical profile of wind speed in the lowest levels of the atmosphere. When present, low-level jets can have a significant impact on wind energy. Wind conditions in low-level jets depart from traditional assumptions about the wind profile, and low-level jets can also influence stability and turbulence, which are important for wind energy applications.

In the literature, a detection algorithm is commonly used to estimate the frequency of low-level jets. The algorithm searches for a wind speed maximum in the lowest levels of the atmosphere with a temperature inversion above the jet maximum; a sketch of such a rule is given below. The algorithm is useful for identifying the presence of low-level jets and estimating their frequency. However, low-level jets can be caused by a number of different mechanisms, which leads to differences in their characteristics. Therefore, additional analysis is necessary to distinguish between different types of jets and characterize their properties. We aim to automate this process using Principal Component Analysis (PCA) to identify the main patterns of wind speed and temperature. By analyzing the diurnal and seasonal cycles of these patterns, a better understanding of the climatology of low-level jets in the region can be gained.
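
A minimal sketch of such a detection rule on a single vertical profile; the fall-off and inversion criteria are illustrative, not the exact thresholds used here.

    import numpy as np

    def is_low_level_jet(z, wspd, temp, min_fall=2.0):
        # A wind-speed maximum in the lowest levels, with speeds decreasing by
        # at least `min_fall` m/s above it and a temperature inversion above
        # the jet core.
        k = int(np.argmax(wspd))
        if k == len(z) - 1:                  # maximum at the top: no jet nose
            return False
        falls_off = wspd[k] - wspd[k + 1:].min() >= min_fall
        inversion_above = np.any(np.diff(temp[k:]) > 0)  # T increasing with height
        return bool(falls_off and inversion_above)

    z = np.array([10, 50, 100, 200, 300, 400, 500])          # height (m)
    wspd = np.array([6, 9, 12, 9, 8, 7, 7])                  # wind speed (m/s)
    temp = np.array([281, 282, 283, 283.5, 283, 282.5, 282]) # temperature (K)
    print(is_low_level_jet(z, wspd, temp))                   # True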

This study focuses on the central part of the Baltic Sea. Several recent studies have identified the presence of low-level jets near the coast of Kurzeme. Typically, low-level jet maxima are located several hundred meters above the surface, while near the coast of Kurzeme they are usually within the lowest 100 meters of the atmosphere.

Data from the UERRA reanalysis, with 11 km horizontal resolution on 12 height levels in the lowest 500 meters of the atmosphere, were used. The low-level jet detection algorithm was applied to the data to estimate the frequency of low-level jets in each grid cell of the model. Jet events were grouped by wind direction to identify the main trajectories of low-level jets in the region. Several atmospheric cross-sections that low-level jets frequently flow through were chosen for further analysis.

Model data were interpolated to the chosen cross-sections, and PCA was applied to the cross-section data of wind speed, geostrophic wind speed and temperature. The main patterns of these meteorological parameters, such as a wind speed maximum, a temperature inversion above the sea surface and the temperature difference between sea and land, were identified by the PCA. Differences in the principal components between cross-sections, and their diurnal and seasonal patterns, helped to gain a better understanding of the climatology, extent and mechanisms of low-level jets in the region.

How to cite: Pogumirskis, M., Sīle, T., and Bethers, U.: Identifying mechanisms of low-level jets near coast of Kurzeme using Principal Component Analysis, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12137, https://doi.org/10.5194/egusphere-egu23-12137, 2023.

EGU23-12528 | ECS | Orals | ITS1.14/CL5.8

Evaluation of explainable AI solutions in climate science 

Philine Bommer, Marlene Kretschmer, Anna Hedstroem, Dilyara Bareeva, and Marina M.-C. Hoehne

Explainable artificial intelligence (XAI) methods help researchers to shed light on the reasons behind the predictions made by deep neural networks (DNNs). XAI methods have already been successfully applied to climate science, revealing underlying physical mechanisms inherent in the studied data. However, evaluating and validating XAI performance is challenging, as explanation methods often lack a ground truth. As the number of XAI methods grows, a comprehensive evaluation is necessary to enable well-founded XAI applications in climate science.

In this work we introduce explanation evaluation in the context of climate research. We apply XAI evaluation to compare multiple explanation methods for a multi-layer perceptron (MLP) and a convolutional neural network (CNN). Both the MLP and the CNN assign temperature maps to classes based on their decade. We assess the respective explanation methods using evaluation metrics measuring robustness, faithfulness, randomization, complexity and localization. Based on the results of a random baseline test, we establish an explanation evaluation guideline for the climate community. We use this guideline to rank the performance in each property for similar sets of explanation methods for the MLP and the CNN. Independent of the network type, we find that Integrated Gradients, layer-wise relevance propagation and InputGradients exhibit higher robustness, faithfulness and complexity than purely gradient-based methods, while sacrificing reactivity to network parameters, i.e. showing low randomization scores. The contrary holds for Gradient, SmoothGrad, NoiseGrad and FusionGrad. Another key observation is that explanations using input perturbations, such as SmoothGrad and Integrated Gradients, do not improve robustness and faithfulness, contrary to theoretical claims. Our experiments highlight that XAI evaluation can be applied to different network tasks and offers more detailed information about different properties of explanation methods than previous research. We demonstrate that XAI evaluation helps to tackle the challenge of choosing an explanation method.
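
As an illustration of one evaluated property, the sketch below measures the robustness of an explanation function to small input perturbations; the metric is a simplified stand-in for the full evaluation suite used in the study.

    import numpy as np

    def explanation_robustness(explain, x, eps=0.01, n_samples=20, seed=0):
        # Average L2 distance between explanations of x and of noisy copies;
        # lower values mean more robust explanations.
        rng = np.random.default_rng(seed)
        base = explain(x)
        dists = [np.linalg.norm(explain(x + rng.normal(0, eps, x.shape)) - base)
                 for _ in range(n_samples)]
        return float(np.mean(dists))

    # Toy model: for a linear model the gradient explanation is constant,
    # hence perfectly robust (distance 0).
    w = np.array([0.5, -1.0, 2.0])
    gradient_explanation = lambda x: w               # d(w.x)/dx = w
    print(explanation_robustness(gradient_explanation, np.ones(3)))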

How to cite: Bommer, P., Kretschmer, M., Hedstroem, A., Bareeva, D., and Hoehne, M. M.-C.: Evaluation of explainable AI solutions in climate science, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12528, https://doi.org/10.5194/egusphere-egu23-12528, 2023.

EGU23-12657 | Orals | ITS1.14/CL5.8 | Highlight

DeepExtremes: Explainable Earth Surface Forecasting Under Extreme Climate Conditions 

Karin Mora, Gunnar Brandt, Vitus Benson, Carsten Brockmann, Gustau Camps-Valls, Miguel-Ángel Fernández-Torres, Tonio Fincke, Norman Fomferra, Fabian Gans, Maria Gonzalez, Chaonan Ji, Guido Kraemer, Eva Sevillano Marco, David Montero, Markus Reichstein, Christian Requena-Mesa, Oscar José Pellicer Valero, Mélanie Weynants, Sebastian Wieneke, and Miguel D. Mahecha

Compound heat waves and drought events draw particular attention as they become more frequent. Co-occurring extreme events often exacerbate impacts on ecosystems and can induce a cascade of detrimental consequences. However, research to understand these events is still in its infancy. DeepExtremes is a project funded by the European Space Agency (https://rsc4earth.de/project/deepextremes/) that aims to use deep learning to gain insight into the Earth's surface under extreme climate conditions. Specifically, the goal is to forecast and explain extreme, multi-hazard, and compound events. To this end, the project leverages the existing Earth observation archive to help us better understand and represent different types of hazards and their effects on society and vegetation. The project implementation involves a multi-stage process consisting of 1) global event detection; 2) intelligent subsampling and creation of mini-data-cubes; 3) forecasting method development, interpretation, and testing; and 4) cloud deployment and upscaling. The data products will be made available to the community following reproducibility and FAIR data principles. By effectively combining Earth system science with explainable AI, the project contributes knowledge to advancing the sustainable management of the consequences of extreme events. This presentation will show the progress made so far and specifically introduce how to participate in the DeepExtremes challenges on spatio-temporal extreme event prediction.

How to cite: Mora, K., Brandt, G., Benson, V., Brockmann, C., Camps-Valls, G., Fernández-Torres, M.-Á., Fincke, T., Fomferra, N., Gans, F., Gonzalez, M., Ji, C., Kraemer, G., Marco, E. S., Montero, D., Reichstein, M., Requena-Mesa, C., Valero, O. J. P., Weynants, M., Wieneke, S., and Mahecha, M. D.: DeepExtremes: Explainable Earth Surface Forecasting Under Extreme Climate Conditions, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12657, https://doi.org/10.5194/egusphere-egu23-12657, 2023.

EGU23-12889 | Orals | ITS1.14/CL5.8

New Berkeley Earth High Resolution Temperature Data Set 

Robert A. Rohde and Zeke Hausfather

Berkeley Earth is premiering a new high resolution analysis of historical instrumental temperatures.

This builds on our existing work on climate reconstruction by adding a simple machine learning layer to our analysis.  This new approach extracts weather patterns from model, satellite, and reanalysis data, and then layers these weather patterns on top of instrumental observations and our existing interpolation methods to produce new high resolution historical temperature fields.  This has quadrupled our output resolution from the previous 1° x 1° lat-long to a new global 0.25° x 0.25° lat-long resolution.  However, this is not simply a downscaling effort.  Firstly, the use of weather patterns derived from physical models and observations increases the spatial realism of the reconstructed fields.  Secondly, observations from regions with high density measurement networks have been directly incorporated into the high resolution field, allowing dense observations to be more fully utilized.  

This new data product uses significantly more observational weather station data and produces higher resolution historical temperature fields than any comparable product, allowing for unprecedented insights into historical local and regional climate change. In particular, the effects of geographic features such as mountains, coastlines, and ecosystem variations are resolved with a level of detail that was not previously possible. At the same time, previously established techniques for bias correction, noise reduction, and error analysis continue to be utilized. The resulting global field initially spans 1850 to present and will be updated on an ongoing basis. This project does not significantly change the global understanding of climate change, but helps to provide local detail that was often unresolved previously. The initial data product focuses on monthly temperatures, though a proposal exists to also create a high resolution daily temperature data set using similar methods.

This talk will describe the construction of the new data set and its characteristics.  The techniques used in this project are accessible enough that they are likely to be useful for other types of instrumental analyses wishing to improve resolution or leverage basic information about weather patterns derived from models or other sources.

How to cite: Rohde, R. A. and Hausfather, Z.: New Berkeley Earth High Resolution Temperature Data Set, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12889, https://doi.org/10.5194/egusphere-egu23-12889, 2023.

EGU23-12948 | ECS | Orals | ITS1.14/CL5.8

Identifying drivers of river floods using causal inference 

Peter Miersch, Shijie Jiang, Oldrich Rakovec, and Jakob Zscheischler

River floods are among the most devastating natural hazards, causing thousands of deaths and billions of euros in damages every year. Floods can result from a combination of compounding drivers such as heavy precipitation, snowmelt, and high antecedent soil moisture. These drivers and the processes they govern vary widely both between catchments and between flood events within a catchment, making a causal understanding of the underlying hydrological processes difficult.

Modern causal inference methods, such as the PCMCI framework, are able to identify drivers from complex time series through causal discovery and to build causally aware statistical models. However, causal inference tailored to extreme events remains a challenge due to data length limitations. To overcome these limitations, here we bridge the gap between synthetic and real-world data using 1,000 years of simulated weather to drive a state-of-the-art hydrological model (the mesoscale Hydrological Model, mHM) over a wide range of European catchments. From the simulated time series, we extract high-runoff events, on which we evaluate the causal inference approach. We identify the minimum data necessary for obtaining robust causal models, evaluate metrics for model evaluation and comparison, and compare causal flood drivers across catchments. Ultimately, this work will help establish best practices in causal inference for flood research to identify meteorological and catchment-specific flood drivers in a changing climate.
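
A minimal sketch of causal discovery with PCMCI on toy driver/runoff series, assuming the tigramite package; variable relationships, lags, and thresholds are illustrative.

    import numpy as np
    from tigramite import data_processing as pp
    from tigramite.pcmci import PCMCI
    # In older tigramite versions: from tigramite.independence_tests import ParCorr
    from tigramite.independence_tests.parcorr import ParCorr

    rng = np.random.default_rng(0)
    T = 1000
    precip = rng.normal(size=T)
    snowmelt = rng.normal(size=T)
    soil = 0.6 * np.roll(precip, 1) + rng.normal(0, 0.5, T)   # wetter after rain
    runoff = 0.5 * precip + 0.3 * soil + rng.normal(0, 0.3, T)

    data = np.column_stack([precip, snowmelt, soil, runoff])
    frame = pp.DataFrame(data, var_names=["precip", "snowmelt", "soil", "runoff"])
    pcmci = PCMCI(dataframe=frame, cond_ind_test=ParCorr())
    results = pcmci.run_pcmci(tau_max=3, pc_alpha=0.05)       # lagged causal links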

How to cite: Miersch, P., Jiang, S., Rakovec, O., and Zscheischler, J.: Identifying drivers of river floods using causal inference, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12948, https://doi.org/10.5194/egusphere-egu23-12948, 2023.

EGU23-13250 | ECS | Posters on site | ITS1.14/CL5.8

From MODIS cloud properties to cloud types using semi-supervised learning 

Julien Lenhardt, Johannes Quaas, and Dino Sejdinovic

Clouds are classified into types, classes, or regimes. The World Meteorological Organization distinguishes stratus and cumulus clouds and three altitude layers. Cloud types exhibit very different radiative properties and interact in numerous ways with aerosol particles in the atmosphere. However, it has proven difficult to define cloud regimes objectively and from remote sensing data, hindering the understanding we have of the processes and adjustments involved.

Building on the method we previously developed, we combine synoptic observations and passive satellite remote-sensing retrievals to constitute a database of cloud types and cloud properties with which to train a cloud classification algorithm. The cloud type labels come from the global marine meteorological observations dataset (UK Met Office, 2006), which comprises near-global synoptic observations. This data record reports information about cloud type and other meteorological quantities at the surface. The cloud classification model is built on different cloud-top and cloud optical properties (Level-2 products MOD06/MYD06 from the MODIS sensor) extracted temporally close to the observation time and on a 128 km x 128 km grid around the synoptic observation location. To make full use of the large quantity of remote sensing data available and to investigate the variety of cloud settings, a convolutional variational autoencoder (VAE) is applied as a dimensionality reduction tool in a first step. Such a model architecture accounts for spatial relationships while describing non-linear patterns in the input data. The cloud classification task is subsequently performed drawing on the constructed latent representation of the VAE. Associating information from underneath and above the cloud enables building a robust model to classify cloud types. For training we specify a study domain in the Atlantic Ocean around the equator and evaluate the method globally. Further experiments and evaluation are carried out on simulation data produced by the ICON model.
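
A minimal sketch of the two-step pipeline, with a stand-in encoder in place of the trained convolutional VAE; shapes, labels, and data are illustrative (smaller than the real 128 km patches for brevity).

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    patches = rng.normal(size=(200, 6, 32, 32))     # 6 toy MODIS cloud properties
    labels = rng.integers(0, 9, 200)                # toy synoptic cloud-type labels

    def encoder(batch):
        # Stand-in for the VAE encoder: returns a low-dimensional latent vector.
        return batch.reshape(len(batch), -1)[:, :32]

    z = encoder(patches)                            # latent representation
    clf = LogisticRegression(max_iter=1000).fit(z, labels)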

How to cite: Lenhardt, J., Quaas, J., and Sejdinovic, D.: From MODIS cloud properties to cloud types using semi-supervised learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13250, https://doi.org/10.5194/egusphere-egu23-13250, 2023.

EGU23-13462 | ECS | Orals | ITS1.14/CL5.8

Double machine learning for geosciences 

Kai-Hendrik Cohrs, Gherardo Varando, Markus Reichstein, and Gustau Camps-Valls

Hybrid modeling describes the synergy between parametric models and machine learning [1]. Parts of a parametric equation are substituted by non-parametric machine learning models, which can represent complex functions. These are inferred together with the parameters of the equation from the data. Hybrid modeling promises to describe complex relationships while remaining scientifically interpretable. These promises, however, need to be taken with a grain of salt. With overly flexible models, such as deep neural networks, the problem of equifinality arises: there is no identifiable optimal solution. Instead, many outcomes describe the data equally well, and we obtain one of them by chance. Interpreting the result may then lead to erroneous conclusions. Moreover, studies have shown that regularization techniques can bias jointly estimated physical parameters [1].

We propose double machine learning (DML) to solve these problems [2]. DML is a theoretically well-founded technique for fitting semi-parametric models, i.e., models consisting of a parametric and a non-parametric component. DML is widely used for debiased treatment effect estimation in economics. We showcase its use for geosciences on two problems related to carbon dioxide fluxes: 

  • Flux partitioning, which aims at separating the net carbon flux (NEE) into its main contributing gross fluxes, namely ecosystem respiration (RECO) and gross primary production (GPP).
  • Estimation of the temperature sensitivity parameter Q10 of ecosystem respiration.

First, we show that for synthetic data in the Q10 estimation problem, we can consistently retrieve the true value of Q10 where the naive neural network approach fails. We further apply DML to the carbon flux partitioning problem and find that it 1) retrieves the true fluxes of synthetic data, even in the presence of strong (and more realistic) heteroscedastic noise, 2) retrieves the main gross carbon fluxes on real data consistently with established methods, and 3) allows us to causally interpret the retrieved GPP as the direct effect of the photosynthetically active radiation on NEE. In this way, the DML approach can be seen as a causally interpretable, semi-parametric version of the established daytime methods. We also investigate the functional relationships inferred with DML and the drivers modulating the obtained light-use efficiency function. In conclusion, DML offers a solid framework for developing hybrid and semi-parametric modeling and can be of widespread use in geosciences.
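
A minimal sketch of the DML partialling-out recipe for a single linear parameter, with cross-fitting via out-of-fold predictions; variable names are illustrative, not the exact flux-partitioning setup.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)
    n = 2000
    W = rng.normal(size=(n, 3))                     # confounders (e.g. meteorology)
    X = np.sin(W[:, 0]) + rng.normal(0, 0.5, n)     # "treatment" driver
    Y = 1.5 * X + np.cos(W[:, 1]) + rng.normal(0, 0.5, n)   # outcome; true effect 1.5

    # Regress out the confounders from both X and Y with flexible ML models,
    # then fit the parametric part on the residuals.
    res_x = X - cross_val_predict(RandomForestRegressor(n_estimators=100), W, X, cv=5)
    res_y = Y - cross_val_predict(RandomForestRegressor(n_estimators=100), W, Y, cv=5)
    theta = LinearRegression().fit(res_x[:, None], res_y).coef_[0]
    print(theta)    # close to 1.5, with the confounding influence removed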

 

[1] Reichstein, Markus, et al. “Combining system modeling and machine learning into hybrid ecosystem modeling.” Knowledge-Guided Machine Learning (2022). https://doi.org/10.1201/9781003143376-14

[2] Chernozhukov, Victor, et al. “Double/debiased machine learning for treatment and structural parameters.” The Econometrics Journal, Volume 21, Issue 1, 1 (2018): C1–C68. https://doi.org/10.1111/ectj.12097

How to cite: Cohrs, K.-H., Varando, G., Reichstein, M., and Camps-Valls, G.: Double machine learning for geosciences, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13462, https://doi.org/10.5194/egusphere-egu23-13462, 2023.

EGU23-13622 | ECS | Posters on site | ITS1.14/CL5.8

Towards explainable marine heatwaves forecasts 

Ayush Prasad and Swarnalee Mazumder

In recent years, both the intensity and extent of marine heatwaves have increased across the world. Anomalies in sea surface temperature affect the health of marine ecosystems, which are crucial to the Earth's climate system. The devastating impacts of marine heatwaves on aquatic life have grown steadily in recent years, harming aquatic ecosystems and causing a tremendous loss of marine life. Early warning systems and operational forecasting that can foresee such events can aid in designing effective mitigation strategies. Recent studies have shown that machine learning and deep learning approaches can forecast the occurrence of marine heatwaves up to a year in advance. However, these models are black boxes and do not provide an understanding of the factors influencing MHWs. In this study, we used machine learning methods to forecast marine heatwaves. The developed models were tested on four historical marine heatwave events around the world. Explainable AI methods were then used to understand and analyze the relationships between the drivers of these events.

How to cite: Prasad, A. and Mazumder, S.: Towards explainable marine heatwaves forecasts, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13622, https://doi.org/10.5194/egusphere-egu23-13622, 2023.

EGU23-14493 | ECS | Orals | ITS1.14/CL5.8

Interpretable probabilistic forecast of extreme heat waves 

Alessandro Lovo, Corentin Herbert, and Freddy Bouchet

Understanding and predicting extreme events is one of the major challenges for the study of climate change impacts, risk assessment, adaptation, and the protection of living beings. Extreme heatwaves are, and will likely remain, among the deadliest weather events. They also increase strain on water resources, food security and energy supply. Developing the ability to forecast their probability of occurrence a few days, weeks, or even months in advance would go a long way toward reducing our vulnerability to these events. Beyond the practical benefits of forecasting heat waves, building interpretable statistical models of extreme events is also highly beneficial from a fundamental point of view: such models enable proper studies of the processes underlying extreme events such as heat waves, improve dataset or model validation, and contribute to attribution studies. Machine learning provides tools to reach both of these goals.

We will first demonstrate that deep neural networks can predict the probability of occurrence of long-lasting 14-day heatwaves over France up to 15 days ahead of time from fast dynamical drivers (the 500 hPa geopotential height field), and at much longer lead times from slow physical drivers (soil moisture). These results represent remarkable forecasting skill. However, such machine learning models tend to be very complex and are often treated as black boxes, which limits our ability to use them for investigating the dynamics of extreme heat waves.

To gain physical understanding, we have therefore designed a network architecture that is intrinsically interpretable. The main idea of this architecture is that the network first computes an optimal index, an optimal projection of the physical fields onto a low-dimensional space. In a second step, it uses a fully non-linear representation of the probability of occurrence of the event as a function of the optimal index. This optimal index can be visualized and compared with classical heuristic understanding of the physical process, for instance in terms of geopotential height and soil moisture. This fully interpretable network is slightly less efficient than the off-the-shelf deep neural network; we fully quantify the performance loss incurred when requiring interpretability and make the connection with the mathematical notion of committor functions. A sketch of the architecture is given below.

This new machine learning tool opens the way to understanding optimal predictors of weather and climate extremes. It has potential for the study of slow drivers and of the effect of climate change on the drivers of extreme events.
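
A minimal sketch of the interpretable architecture described above, assuming PyTorch: a linear projection onto a low-dimensional index followed by a small non-linear map to the event probability (dimensions are illustrative).

    import torch
    import torch.nn as nn

    class IndexNet(nn.Module):
        def __init__(self, n_features, index_dim=2):
            super().__init__()
            # Linear projection of the physical fields: the "optimal index".
            self.projection = nn.Linear(n_features, index_dim, bias=False)
            # Fully non-linear probability of the event given the index.
            self.prob = nn.Sequential(
                nn.Linear(index_dim, 16), nn.ReLU(),
                nn.Linear(16, 1), nn.Sigmoid(),
            )

        def forward(self, x):
            index = self.projection(x)   # interpretable, can be mapped back to fields
            return self.prob(index), index

    model = IndexNet(n_features=1000)
    x = torch.randn(8, 1000)             # flattened geopotential + soil moisture (toy)
    p, index = model(x)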

How to cite: Lovo, A., Herbert, C., and Bouchet, F.: Interpretable probabilistic forecast of extreme heat waves, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14493, https://doi.org/10.5194/egusphere-egu23-14493, 2023.

EGU23-14856 | ECS | Orals | ITS1.14/CL5.8

Classification of Indoor Air Pollution Using Low-cost Sensors by Machine Learning 

Andrii Antonenko, Viacheslav Boretskij, and Oleksandr Zagaria

Air pollution has become an integral part of modern life. Combustion processes associated with energy-intensive industrial activities can be considered its main source. Energy companies consume about one-third of the fuel produced and are a significant source of air pollution [1]. State and public air quality monitoring networks were created to monitor the situation. Public monitoring networks are cheaper and have wider coverage than governmental ones. Although the state monitoring system provides more accurate data, an inexpensive network is sufficient to inform the public about the presence or absence of pollution (air quality). With public information in mind, the idea arose to test the possibility of detecting types of pollution using data from cheap air quality monitoring sensors. In general, to use a cheap sensor for measurements, it must first be calibrated (corrected) by comparing its readings with a reference device. Various mathematical methods can be used for this. One such method is neural network training, which has proven itself well for correcting PM particle readings for the impact of relative humidity [2].

The idea of using a neural network to improve data quality is not new, but it is quite promising, as the authors showed in [3]. The main problem in implementing this method is obtaining a reliable dataset for training the network. For this, it is necessary to record sensor readings for relatively clean air and for artificially generated or known sources of pollution. A neural network trained on the collected data can then be used to determine (classify) air types: with pollution (pollutant) or without. To this end, an experiment was set up in the "ReLab" co-working space at the Taras Shevchenko National University of Kyiv. The sensors were placed in a closed box with airflow ventilation. The ZPHS01B [4] sensor module was used for inbox measurements, together with calibrated PMS7003 [5] and BME280 [6] sensors. Additionally, IPS 7100 [7] and SPS30 [8] sensors were added to enrich the database for ML training. A platform based on the HiLink 7688 was used for data collection, processing, and transmission.

Data were measured every two seconds, independently from each sensor. Before each experiment, the room was ventilated to avoid any influence on the next series of experiments.
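
A hypothetical sketch of the intended classification step on the collected dataset; the feature set, class labels, and data are illustrative assumptions.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    # Columns could be PM1, PM2.5, PM10, CO2, TVOC, temperature, humidity.
    X = rng.uniform(0.0, 1.0, size=(5000, 7))
    y = rng.integers(0, 3, 5000)     # 0=clean air, 1=smoke, 2=dust (experiment labels)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500).fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))     # held-out classification accuracy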

References

1. Zaporozhets A. Analysis of means for monitoring air pollution in the environment. Science-based technologies. 2017, Vol. 35, no3. 242-252. DOI: 10.18372/2310-5461.35.11844

2. Antonenko A, (2021) Correction of fine particle concentration readings depending on relative humidity, [Master's thesis, Taras Shevchenko National University of Kyiv], 35 pp.

3. Lee, J. Kang, S. Kim, Y. Im, S. Yoo, D. Lee, "Long-Term Evaluation and Calibration of Low-Cost Particulate Matter (PM) Sensor", Sensors, vol. 20, 3617, 24 pp., 2020.

4. ZPHS01B Datasheet URL: https://pdf1.alldatasheet.com/datasheet-pdf/view/1303697/WINSEN/ZPHS01B.html

5. Plantower PMS7003 Datasheet URL: https://www.espruino.com/datasheets/PMS7003.pdf

6. Bosch 280 Datasheet URL: https://www.mouser.com/datasheet/2/783/BST-BME280-DS002-1509607.pdf

7. https://pierasystems.com/intelligent-particle-sensors/

8. https://sensirion.com/products/catalog/SPS30/

How to cite: Antonenko, A., Boretskij, V., and Zagaria, O.: Classification of Indoor Air Pollution Using Low-cost Sensors by Machine Learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14856, https://doi.org/10.5194/egusphere-egu23-14856, 2023.

EGU23-15000 | ECS | Orals | ITS1.14/CL5.8 | Highlight

Causal inference to study food insecurity in Africa 

Jordi Cerdà-Bautista, José María Tárraga, Gherardo Varando, Alberto Arribas, Ted Shepherd, and Gustau Camps-Valls

Food insecurity in Africa, and in the Horn of Africa in particular, has reached an unprecedented risk level, triggered by continuous drought events, complicated interactions between food prices, crop yield, energy inflation and a lack of humanitarian aid, along with disruptive conflicts and migration flows. Food security is a complex, multivariate, multiscale, and non-linear problem that is difficult to understand with canonical data science methodologies. We propose an alternative approach to the food insecurity problem from a causal inference standpoint to discover causal relations and evaluate the likelihood and potential consequences of specific interventions. In particular, we demonstrate the use of causal inference for understanding the impact of humanitarian interventions on food insecurity in Somalia. In the first stage of the problem, we apply different data transformations to the main drivers to achieve the highest degree of correlation with the variable of interest. In the second stage, we infer causation between the main drivers and the variables of interest by applying causal methods such as PCMCI or Granger causality. We analyze and harmonize different time series, per district of Somalia, of the global acute malnutrition (GAM) index, food market prices, crop production, conflict levels, drought and flood internal displacements, as well as climate indicators such as the NDVI, precipitation and land surface temperature. Then, assuming a causal graph between the main drivers of the food insecurity problem, we estimate the effect of increasing humanitarian interventions on the GAM index, considering the effects of a changing climate, migration flows, and conflict events. We show that causal estimation with modern methodologies allows us to quantify the impact of humanitarian aid on food insecurity.
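
A minimal sketch of one of the named methods, Granger causality, on toy district-level series, assuming statsmodels; the test asks whether the second column helps predict the first.

    import numpy as np
    from statsmodels.tsa.stattools import grangercausalitytests

    rng = np.random.default_rng(0)
    T = 200
    rain = rng.normal(size=T)
    gam = np.zeros(T)
    for t in range(2, T):            # toy: GAM worsens two months after low rain
        gam[t] = 0.5 * gam[t - 1] - 0.4 * rain[t - 2] + rng.normal(0, 0.3)

    data = np.column_stack([gam, rain])
    res = grangercausalitytests(data, maxlag=3)   # F-tests per lag: rain -> GAM?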

 

References

 

[1] Runge, J., Bathiany, S., Bollt, E. et al. Inferring causation from time series in Earth system sciences. Nat Commun 10, 2553 (2019). https://doi.org/10.1038/s41467-019-10105-3

[2] Sazib Nazmus, Mladenova lliana E., Bolten John D., Assessing the Impact of ENSO on Agriculture Over Africa Using Earth Observation Data, Frontiers in Sustainable Food Systems, 2020, 10.3389/fsufs.2020.509914. https://www.frontiersin.org/article/10.3389/fsufs.2020.509914

[3] Checchi, F., Frison, S., Warsame, A. et al. Can we predict the burden of acute malnutrition in crisis-affected countries? Findings from Somalia and South Sudan. BMC Nutr 8, 92 (2022). https://doi.org/10.1186/s40795-022-00563-2

How to cite: Cerdà-Bautista, J., Tárraga, J. M., Varando, G., Arribas, A., Shepherd, T., and Camps-Valls, G.: Causal inference to study food insecurity in Africa, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15000, https://doi.org/10.5194/egusphere-egu23-15000, 2023.

EGU23-15185 | ECS | Posters on site | ITS1.14/CL5.8

Deep learning to support ocean data quality control 

Mohamed Chouai, Felix Simon Reimers, and Sebastian Mieruch-Schnülle

In this study, which is part of the M-VRE project [https://mosaic-vre.org/about], we aim to improve a quality control (QC) system for Arctic Ocean temperature profile data using deep learning. For the training, validation, and evaluation of our algorithms, we use the UDASH dataset [https://essd.copernicus.org/articles/10/1119/2018/]. In the classical QC setting, the ocean expert or "operator" applies a series of thresholding (classical) algorithms to identify, i.e. flag, erroneous data. In the next step, the operator visually inspects every profile where suspicious samples have been identified. The goal of this time-consuming visual QC is to find "false positives", i.e. flagged data that are actually good, because every sample/profile has not only a scientific value but also a monetary one. Finally, the operator turns all "false positive" data back to good. The crucial point is that although these samples/profiles exceed certain thresholds, they are considered good by the ocean expert. Such human expert decisions are extremely difficult, if not impossible, to reproduce with classical algorithms. However, deep-learning neural networks have the potential to learn complex human behavior. We therefore trained a deep learning system to "learn" exactly this expert behavior of finding "false positives" (identified by the classical thresholds), which can be turned back to good accordingly. The first results are promising: in a fully automated setting, deep learning improves the results and fewer data are flagged. In a subsequent visual QC setting, deep learning relieves the expert with a distinct workload reduction and offers the option to clearly increase the quality of the data.
Our long-term goal is to develop an Arctic quality control system as a series of web services and Jupyter notebooks that apply automated and visual QC online: efficiently, consistently, reproducibly, and interactively.
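
A minimal sketch of the learning task described above, with synthetic stand-ins for the profile features and the expert decisions.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    # Hypothetical per-flag features: depth of flagged sample, temperature
    # anomaly, vertical gradient, distance to climatology, neighbour agreement.
    X = rng.normal(size=(8000, 5))
    y = (X[:, 3] < 0.5).astype(int)      # toy stand-in for expert "actually good"

    model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500).fit(X, y)
    keep_good = model.predict(X) == 1    # flags the expert would likely revert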

How to cite: Chouai, M., Simon Reimers, F., and Mieruch-Schnülle, S.: Deep learning to support ocean data quality control, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15185, https://doi.org/10.5194/egusphere-egu23-15185, 2023.

EGU23-15286 | ECS | Orals | ITS1.14/CL5.8

Spatio-temporal downscaling of precipitation data using a conditional generative adversarial network 

Luca Glawion, Julius Polz, Benjamin Fersch, Harald Kunstmann, and Christian Chwala

Natural disasters caused by cyclones, hail, landslides or floods are directly related to precipitation. Global climate models are an important tool for adapting to these hazards in a future climate. However, they operate on spatial and temporal discretizations that limit their ability to adequately reflect these fast-evolving, highly localized phenomena, which has led to the development of various downscaling approaches.

Conditional generative adversarial networks (cGAN) have recently been applied as a promising downscaling technique to improve the spatial resolution of climate data. The ability of GANs to generate ensembles of solutions from random perturbations can be used to account for the stochasticity of climate data and quantify uncertainties. 

We present a cGAN for downscaling not only the spatial but simultaneously also the temporal dimension of precipitation data, a so-called video super-resolution approach. 3D convolutional layers are exploited to extract and generate temporally consistent rain events with realistic fine-scale structure. We downscale coarsened, gauge-adjusted and climatology-corrected precipitation data from Germany from a spatial resolution of 32 km to 2 km and a temporal resolution of 1 h to 10 min, applying a novel training routine using partly normalized and logarithmized data, which allows for improved extreme value statistics of the generated fields.
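
A hedged sketch of such a 3D-convolutional generator in PyTorch; the upsampling factors mirror the stated ratios (1 h to 10 min, 32 km to 2 km), while the layer sizes are illustrative rather than the authors' architecture:

```python
# Minimal sketch (PyTorch; illustrative layer sizes): a 3D-convolutional generator
# that upsamples precipitation fields in time (x6) and space (x16), conditioned on
# the coarse field and a noise field for the probabilistic ensemble.
import torch
import torch.nn as nn

class Generator3D(nn.Module):
    def __init__(self, noise_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1 + noise_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            # upsample the (time, height, width) axes jointly
            nn.Upsample(scale_factor=(6, 16, 16), mode="trilinear", align_corners=False),
            nn.Conv3d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv3d(64, 1, kernel_size=3, padding=1),
        )

    def forward(self, coarse, noise):
        # coarse, noise: (batch, 1, T, H, W)
        return self.net(torch.cat([coarse, noise], dim=1))

g = Generator3D()
fine = g(torch.rand(2, 1, 4, 8, 8), torch.randn(2, 1, 4, 8, 8))
print(fine.shape)  # torch.Size([2, 1, 24, 128, 128])
```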

Exploiting the fully convolutional nature of our model, we can generate downscaled maps for the whole of Germany in a single downscaling step at low latency. The evaluation of these maps using spatial and temporal power spectrum analysis shows that the generated temporal and spatial structures are in high agreement with the reference. Visually, the generated temporally evolving and advecting rain events are hardly recognizable as artificially generated. The model also shows high skill regarding pixel-wise error and localization of high precipitation intensities, as measured by the FSS, CRPS, KS and RMSE. Because downscaling is an underdetermined problem, a probabilistic cGAN approach yields information beyond that of the deterministic models we use for comparison. The method is also capable of preserving the climatology, e.g., expressed as the annual precipitation sum. Investigating temporal aggregations of the downscaled fields revealed an interesting effect: structures generated by networks with convolutional layers are not placed completely at random but can recur, a behavior that can also be discovered in other prominent DL downscaling models. Although these recurrent structures can be mitigated by adequate model selection, their occurrence remains an open research question.

We conclude that our proposed approach extends the application of cGANs for downscaling to the time dimension and is therefore a promising candidate to supplement conventional downscaling methods, owing to its high performance and computational efficiency.

How to cite: Glawion, L., Polz, J., Fersch, B., Kunstmann, H., and Chwala, C.: Spatio-temporal downscaling of precipitation data using a conditional generative adversarial network, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15286, https://doi.org/10.5194/egusphere-egu23-15286, 2023.

EGU23-15540 | ECS | Posters on site | ITS1.14/CL5.8 | Highlight

USCC: A Benchmark Dataset for Crop Yield Prediction under Climate Extremes 

Adrian Höhl, Stella Ofori-Ampofo, Ivica Obadic, Miguel-Ángel Fernández-Torres, Ridvan Salih Kuzu, and Xiaoxiang Zhu

Climate variability and extremes are known to be major causes of crop yield anomalies. They can reduce crop productivity, which results in disruptions in food availability and nutritional quality, as well as rising food prices. Climate extremes will become even more severe as global warming proceeds, challenging the achievement of food security. These extreme events, especially droughts and heat waves, are already evident in major food-producing regions like the United States. Crops cultivated in this country, such as corn and soybean, are critical for both domestic use and international supply. Considering the sensitivity of crops to climate, here we present a dataset that couples remote sensing surface reflectances with climate variables (e.g. minimum and maximum temperature, precipitation, and vapor pressure) and extreme indicators. The dataset contains the crop yields of various commodities over the USA for nearly two decades. Given the advances and proven success of machine learning in numerous remote sensing tasks, our dataset constitutes a benchmark to advance the development of novel models for crop yield prediction and to analyze the relationship between climate and crop yields for gaining scientific insights. Other potential use cases include extreme event detection and climate forecasting from satellite imagery. As a starting point, we evaluate the performance of several state-of-the-art machine and deep learning models to form a baseline for our benchmark dataset.
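
A minimal baseline sketch on synthetic stand-in data, illustrating the kind of model evaluation the benchmark supports (not the authors' code):

```python
# Minimal sketch (synthetic stand-ins): a random forest baseline regressing
# region-year yield on reflectance and climate features.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.random((3000, 30))                   # seasonal reflectance + Tmin/Tmax/precip/VP stats
y = 50 * X[:, 0] + rng.normal(0, 5, 3000)    # toy yield target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("R^2:", r2_score(y_te, model.predict(X_te)))
```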

How to cite: Höhl, A., Ofori-Ampofo, S., Obadic, I., Fernández-Torres, M.-Á., Salih Kuzu, R., and Zhu, X.: USCC: A Benchmark Dataset for Crop Yield Prediction under Climate Extremes, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15540, https://doi.org/10.5194/egusphere-egu23-15540, 2023.

EGU23-15817 | ECS | Posters on site | ITS1.14/CL5.8

Evaluating the generalization ability of a deep learning model trained to detect cloud-to-ground lightning on raw ERA5 data 

Gregor Ehrensperger, Tobias Hell, Georg Johann Mayr, and Thorsten Simon

Atmospheric conditions that are typical for lightning are commonly represented by proxies such as cloud top height, cloud ice flux, CAPE times precipitation, or the lightning potential index. While these proxies generally deliver reasonable results, they often need to be adapted to local conditions in order to perform well. This suggests a need for more complex and holistic proxies. Recent research confirms that the use of machine learning (ML) approaches for describing lightning is promising.

In a previous study, a deep learning model was trained on single spatiotemporal (30 km x 30 km x 1 h) cells in the summer periods of the years 2010–2018 and showed good results for the unseen test year 2019 within Austria. We now improve this model by using multiple neighboring vertical atmospheric columns to also account for horizontal moisture advection. Furthermore, data from successive hours are used as input to enable the model to capture the temporal development of atmospheric conditions, such as the build-up and decay of convection.
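
A minimal sketch (illustrative shapes only) of assembling such an input sample from neighboring columns and successive hours:

```python
# Minimal sketch: stacking a 3x3 neighbourhood of vertical columns over the
# preceding hours around each target cell; all shapes are dummy placeholders.
import numpy as np

era5 = np.random.rand(240, 60, 80, 138)   # (hours, lat, lon, vertical features)
hours_back, halo = 3, 1                   # 3 successive hours, 3x3 column neighbourhood

def sample(t, i, j):
    # returns a (hours_back, 3, 3, features) block centred on cell (i, j) at hour t
    return era5[t - hours_back + 1 : t + 1,
                i - halo : i + halo + 1,
                j - halo : j + halo + 1]

x = sample(10, 30, 40)
print(x.shape)  # (3, 3, 3, 138)
```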

In this work we focus on the summer months June to August and use data from parts of Central Europe. This spatial domain is thought to be representative of Continental Europe, since it covers mountainous as well as coastal regions. We take raw ERA5 parameters on levels reaching beyond the tropopause, enriched with a small amount of metadata such as the day of the year and the hour of the day, for training. The quality of the resulting parameterized model is then evaluated on Continental Europe to examine its generalization ability.

Using parts of Central Europe to train the model, we evaluate its ability to generalize to unseen parts of Continental Europe using EUCLID data. A model that generalizes well is a building block for retrospective analyses reaching back to years in which structured, unified recording of accurate lightning observations was not yet established.

How to cite: Ehrensperger, G., Hell, T., Mayr, G. J., and Simon, T.: Evaluating the generalization ability of a deep learning model trained to detect cloud-to-ground lightning on raw ERA5 data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15817, https://doi.org/10.5194/egusphere-egu23-15817, 2023.

EGU23-16098 | Posters on site | ITS1.14/CL5.8

Identifying Lightning Processes in ERA5 Soundings with Deep Learning 

Tobias Hell, Gregor Ehrensperger, Georg J. Mayr, and Thorsten Simon

Atmospheric environments favorable for lightning and convection are commonly represented by proxies or parameterizations based on expert knowledge, such as CAPE, wind shear, charge separation, or combinations thereof. Recent developments in the field of machine learning, high-resolution reanalyses, and accurate lightning observations open possibilities for identifying tailored proxies without prior expert knowledge. To identify vertical profiles favorable for lightning, a deep neural network links ERA5 vertical profiles of cloud physics, mass field variables and wind to lightning location data from the Austrian Lightning Detection & Information System (ALDIS), transformed into a binary target variable labelling the ERA5 cells as lightning or no-lightning cells. The ERA5 parameters are taken on model levels beyond the tropopause, forming an input layer of approx. 670 features. The data of 2010–2018 serve as training/validation data. On independent test data, from 2019, the deep network outperforms a reference with features based on meteorological expertise. Shapley values highlight the atmospheric processes learned by the network, which identifies cloud ice and snow content in the upper and mid-troposphere as relevant features. As these patterns correspond to the separation of charge in thunderstorm clouds, the deep learning model can serve as a physically meaningful description of lightning.
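
A hedged sketch of the attribution step, assuming a trained Keras-style network and the shap package (GradientExplainer is a drop-in alternative if DeepExplainer does not support the model); data and model here are dummies:

```python
# Minimal sketch (dummy model and data): Shapley-value attributions over the
# ~670 ERA5 profile features, the step that highlights cloud ice/snow levels.
import numpy as np
import shap
import tensorflow as tf

X_train = np.random.rand(1000, 670).astype("float32")   # dummy profile features
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(670,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

background = X_train[np.random.choice(len(X_train), 100, replace=False)]
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(X_train[:256])      # per-feature attributions
```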

How to cite: Hell, T., Ehrensperger, G., Mayr, G. J., and Simon, T.: Identifying Lightning Processes in ERA5 Soundings with Deep Learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16098, https://doi.org/10.5194/egusphere-egu23-16098, 2023.

EGU23-16163 | ECS | Posters on site | ITS1.14/CL5.8

A comparison of methods for determining the number of classes in unsupervised classification of climate models 

Emma Boland, Dani Jones, and Erin Atkinson

Unsupervised classification is becoming an increasingly common method to objectively identify coherent structures within both observed and modelled climate data. However, the user must choose the number of classes to fit in advance. Typically, a combination of statistical methods and expertise is used to choose the appropriate number of classes for a given study; however, it may not be possible to identify a single 'optimal' number of classes. In this work we present a heuristic method for determining the number of classes unambiguously for modelled data where more than one ensemble member is available. This method requires robustness of the class definitions between simulated ensembles of the system of interest. For demonstration, we apply this to the clustering of Southern Ocean potential temperatures in a CMIP6 climate model, and compare with other common criteria such as the Bayesian Information Criterion (BIC) and the silhouette score.
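
A minimal sketch of scanning the number of classes with two of the common criteria mentioned above, on synthetic stand-in data:

```python
# Minimal sketch (synthetic stand-in profiles): scanning the number of classes
# with BIC and silhouette score, two criteria compared against the new heuristic.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score

X = np.random.rand(2000, 8)   # e.g. flattened potential-temperature profiles
for k in range(2, 10):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    labels = gmm.predict(X)
    print(k, gmm.bic(X), silhouette_score(X, labels))
```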

How to cite: Boland, E., Jones, D., and Atkinson, E.: A comparison of methods for determining the number of classes in unsupervised classification of climate models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16163, https://doi.org/10.5194/egusphere-egu23-16163, 2023.

EGU23-16186 | ECS | Posters on site | ITS1.14/CL5.8

A review of deep learning for weather prediction 

Jannik Thümmel, Martin Butz, and Bedartha Goswami

Recent years have seen substantial performance improvements of deep-learning-based weather prediction models (DLWPs). These models cover a large range of temporal and spatial resolutions—from nowcasting to seasonal forecasting and on scales ranging from single to hundreds of kilometers. DLWPs also exhibit a wide variety of neural architectures and training schemes, with no clear consensus on best practices. Focusing on the short-to-mid-term forecasting ranges, we review several recent, best-performing models with respect to critical design choices. We emphasize the importance of self-organizing latent representations and inductive biases in DLWPs: While NWPs are designed to simulate resolvable physical processes and integrate unresolvable subgrid-scale processes by approximate parameterizations, DLWPs allow the latent representation of both kinds of dynamics. The purpose of this review is to facilitate targeted research developments and understanding of how design choices influence performance of DLWPs. While there is no single best model, we highlight promising avenues towards accurate spatio-temporal modeling, probabilistic forecasts and computationally efficient training and inference.

How to cite: Thümmel, J., Butz, M., and Goswami, B.: A review of deep learning for weather prediction, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16186, https://doi.org/10.5194/egusphere-egu23-16186, 2023.

EGU23-16443 | ECS | Orals | ITS1.14/CL5.8

Hybrid machine learning model of coupled carbon and water cycles 

Zavud Baghirov, Basil Kraft, Martin Jung, Marco Körner, and Markus Reichstein

There is evidence for a strong coupling between the terrestrial carbon and water cycles and that these cycles should be studied as an interconnected system (Humphrey et al. 2018). One of the key methods to numerically represent the Earth system is process based modelling, which is, however, still subject to large uncertainties, e.g., due to wrong or incomplete process knowledge (Bonan and Doney 2018). Such models are often rigid and only marginally informed by Earth observations. This is where machine learning (ML) approaches can be advantageous, due to their ability to learn from data in a flexible way. These methods have their own shortcomings, such as their “black-box” nature and lack of physical consistency.

Recently, it has been suggested by Reichstein et al. (2019) to combine process knowledge with ML algorithms to model environmental processes. This so-called hybrid modelling approach has already been used to model different components of terrestrial water storage (TWS) in a global hydrological model (Kraft et al. 2022). This study follows up on that work with the objective of improving the parameterization of some processes (e.g., soil moisture) and of coupling the model with the carbon cycle. The coupling could potentially reduce model uncertainties and help to better understand water-carbon interactions.

The proposed hybrid model of the coupled water and carbon cycles is forced with reanalysis data from ERA-5, such as air temperature and net radiation, and with the CO2 concentration from CAMS. Water-carbon cycle processes are constrained using observational data products of the water and carbon cycles. The hybrid model uses a long short-term memory (LSTM) model—a member of the recurrent neural network family—at its core for processing the time-series Earth observation data. The LSTM produces a number of coefficients that are used as parameters in the conceptual model of the water and carbon cycles. Some of the key processes represented in the conceptual model are evapotranspiration, snow, soil moisture, runoff, groundwater, water use efficiency (WUE), ecosystem respiration, and net ecosystem exchange. The model partitions TWS into different components, and it can be used to assess the impact of different TWS components on the CO2 growth rate. Moreover, we can assess the learned system behaviors of water and carbon cycle interactions for different ecosystems.
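
A minimal sketch of the hybrid idea, with an LSTM producing a time-varying coefficient for a toy conceptual water-balance equation; the structure and variable layout are illustrative, not the authors' model:

```python
# Minimal sketch (illustrative): an LSTM outputs a coefficient that parameterizes
# a simple conceptual-model step; the whole graph is trained end to end against
# observational constraints.
import tensorflow as tf

forcing = tf.keras.Input(shape=(None, 8))                   # e.g. ERA-5 drivers + CO2
h = tf.keras.layers.LSTM(64, return_sequences=True)(forcing)
alpha = tf.keras.layers.Dense(1, activation="sigmoid")(h)   # e.g. runoff fraction in [0, 1]

precip = forcing[..., :1]        # assume channel 0 carries precipitation (toy layout)
runoff = alpha * precip          # conceptual-model equation, differentiable

model = tf.keras.Model(forcing, runoff)
model.compile(optimizer="adam", loss="mse")   # loss against observational products
```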

References:

Bonan, Gordon B, and Scott C Doney. 2018. “Climate, Ecosystems, and Planetary Futures: The Challenge to Predict Life in Earth System Models.” Science 359 (6375): eaam8328.

Humphrey, Vincent, Jakob Zscheischler, Philippe Ciais, Lukas Gudmundsson, Stephen Sitch, and Sonia I Seneviratne. 2018. “Sensitivity of Atmospheric CO2 Growth Rate to Observed Changes in Terrestrial Water Storage.” Nature 560 (7720): 628–31.

Kraft, Basil, Martin Jung, Marco Körner, Sujan Koirala, and Markus Reichstein. 2022. “Towards Hybrid Modeling of the Global Hydrological Cycle.” Hydrology and Earth System Sciences 26 (6): 1579–1614.

Reichstein, Markus, Gustau Camps-Valls, Bjorn Stevens, Martin Jung, Joachim Denzler, Nuno Carvalhais, et al. 2019. “Deep Learning and Process Understanding for Data-Driven Earth System Science.” Nature 566 (7743): 195–204.

How to cite: Baghirov, Z., Kraft, B., Jung, M., Körner, M., and Reichstein, M.: Hybrid machine learning model of coupled carbon and water cycles, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16443, https://doi.org/10.5194/egusphere-egu23-16443, 2023.

EGU23-16449 | Orals | ITS1.14/CL5.8

Data-driven seasonal forecasts of European heat waves 

Stefano Materia, Martin Jung, Markus G. Donat, and Carlos Gomez-Gonzalez

Seasonal forecasts are critical tools for early-warning decision support systems that can help reduce the risks associated with hot or cold weather and other events that can strongly affect a multitude of socio-economic sectors. Recent advances in both statistical approaches and numerical modeling have improved the skill of seasonal forecasts. However, especially in the mid-latitudes, they are still affected by large uncertainties that can limit their usefulness.

The MSCA-H2020 project ARTIST aims to improve our knowledge of climate predictability at the seasonal time-scale, focusing on the role of unexplored drivers, to finally enhance the performance of current prediction systems. This effort is meant to reduce uncertainties and make forecasts efficiently usable by regional meteorological services and private bodies. This study focuses on the seasonal prediction of heat extremes in Europe, and here we present a first attempt to predict heat wave accumulated activity across different target seasons. An empirical seasonal forecast is designed based on machine learning techniques. A feature selection approach is used to detect the best subset of predictors among a variety of candidates, and the relative importance of each predictor is then assessed, in different European regions for the four main seasons.
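
A hedged sketch of the two methodological steps on synthetic stand-in data, using scikit-learn's sequential feature selection and permutation importance as generic stand-ins for the approach described above:

```python
# Minimal sketch (synthetic stand-ins): greedy feature selection followed by a
# permutation-importance assessment of the retained predictors.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.inspection import permutation_importance

X = np.random.rand(400, 20)   # candidate drivers (e.g. SST and soil-moisture indices)
y = np.random.rand(400)       # heat-wave accumulated activity for one region/season

sel = SequentialFeatureSelector(GradientBoostingRegressor(), n_features_to_select=5).fit(X, y)
Xs = sel.transform(X)
model = GradientBoostingRegressor().fit(Xs, y)
imp = permutation_importance(model, Xs, y, n_repeats=20, random_state=0)
print(sel.get_support(indices=True), imp.importances_mean)
```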

Results show that many observed teleconnections are captured by the data-driven approach, while a few features that appear to be linked to the heat wave propensity of a season call for a deeper understanding of the underlying physical processes.

How to cite: Materia, S., Jung, M., Donat, M. G., and Gomez-Gonzalez, C.: Data-driven seasonal forecasts of European heat waves, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16449, https://doi.org/10.5194/egusphere-egu23-16449, 2023.

EGU23-16846 | ECS | Orals | ITS1.14/CL5.8

Learning causal drivers of PyroCb 

Emiliano Díaz, Gherardo Varando, Fernando Iglesias-Suarez, Gustau Camps-Valls, Kenza Tazi, Kara Lamb, and Duncan Watson-Parris

Discovering causal relationships from purely observational data is often not possible. In this case, combining observational and experimental data can allow for the identifiability of the underlying causal structure. In the Earth system sciences, carrying out interventional experiments is often impossible for ethical and practical reasons. However, "natural interventions" are often present in the data; these represent regime changes caused by changes to exogenous drivers. In [4,5], the Invariant Causal Prediction (ICP) methodology was presented to identify the causes of a target variable of interest from a set of candidate causes. This methodology takes advantage of natural interventions, which result in different distributions of the cause variables across different environments. In [2], this methodology is applied to a geoscience problem, namely identifying the causes of pyrocumulonimbus (pyroCb), storm clouds resulting from extreme wildfires. Although a set of plausible causes is produced, certain heuristic adaptations of the original ICP methodology were needed to overcome some of its practical limitations: the large number of hypothesis tests required and a failure to identify causes when these have a high degree of interdependence. In this work, we try to circumvent these difficulties by taking a different approach. We use a learning paradigm similar to that presented in [3] to learn causal representations invariant across different environments. Since we often do not know exactly how best to define the different environments, we also propose to learn functions that describe their spatiotemporal extent. We apply the resulting algorithm to the pyroCb database in [1] and other Earth system science datasets to verify the plausibility of the causal representations found and of the environments that describe the so-called natural interventions.
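
A minimal sketch of the invariance penalty underlying the learning paradigm of [3] (the IRMv1 formulation), with dummy data and environments:

```python
# Minimal sketch (dummy data/environments): an IRMv1-style invariance penalty,
# the gradient of the per-environment risk w.r.t. a dummy classifier scale.
import torch

def irm_penalty(logits, y):
    w = torch.tensor(1.0, requires_grad=True)
    loss = torch.nn.functional.binary_cross_entropy_with_logits(logits * w, y)
    grad = torch.autograd.grad(loss, [w], create_graph=True)[0]
    return (grad ** 2).sum()

model = torch.nn.Linear(10, 1)
total = 0.0
for X, y in [(torch.randn(64, 10), torch.rand(64, 1).round()) for _ in range(3)]:  # 3 environments
    logits = model(X)
    total = total + torch.nn.functional.binary_cross_entropy_with_logits(logits, y) \
                  + 1.0 * irm_penalty(logits, y)
total.backward()  # step with any torch optimizer
```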

 

[1] Tazi et al. 2022. https://arxiv.org/abs/2211.13052

[2] Díaz et al. 2022 .https://arxiv.org/abs/2211.08883

[3] Arjovsky et al. 2019. https://arxiv.org/abs/1907.02893

[4] Peters et al. 2016. https://www.jstor.org/stable/4482904

[5] Heinze-Deml et al. 2018. https://doi.org/10.1515/jci-2017-0016

How to cite: Díaz, E., Varando, G., Iglesias-Suarez, F., Camps-Valls, G., Tazi, K., Lamb, K., and Watson-Parris, D.: Learning causal drivers of PyroCb, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16846, https://doi.org/10.5194/egusphere-egu23-16846, 2023.

EGU23-17082 | ECS | Posters on site | ITS1.14/CL5.8

A statistical approach on rapid estimations of climate change indices by monthly instead of daily data 

Kristofer Hasel, Marianne Bügelmayer-Blaschek, and Herbert Formayer

Climate change indices (CCI) defined by the Expert Team on Climate Change Detection and Indices (ETCCDI) contribute profoundly to understanding climate and its change. They are used to present climate change in an easy-to-understand and tangible way, thus facilitating climate communication. Many of the indices are peak-over-threshold indices that must be calculated from daily and, if necessary, bias-corrected data. We present a method to rapidly estimate specific CCI from monthly data instead of daily data, while also performing a simple bias correction as well as a localisation (downscaling). To this end, we used ERA5-Land data with a spatial resolution of 0.1°, supplemented by a CMIP6 SSP5-8.5 climate projection, to derive different regression functions which allow a rapid estimation from monthly data. Using a climate projection as a supplement in training the regression functions allows an application not only to historical periods but also to future periods such as those provided by climate projections. Nevertheless, the presented method can be adapted to any data set, allowing an even higher spatial resolution.
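
A minimal sketch of the core idea on toy data, using a simple linear regression as a stand-in for the derived regression functions:

```python
# Minimal sketch (toy data): regressing a peak-over-threshold index (e.g. summer
# days per month) on a monthly mean, so the index can be estimated without daily data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
monthly_tmean = rng.normal(15, 5, size=(500, 1))                  # monthly predictor
summer_days = np.maximum(0, 2.5 * monthly_tmean[:, 0] - 20)       # toy index counts
summer_days += rng.normal(0, 2, size=500)

reg = LinearRegression().fit(monthly_tmean, summer_days)
print(reg.predict([[20.0]]))  # rapid CCI estimate from a monthly mean
```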

How to cite: Hasel, K., Bügelmayer-Blaschek, M., and Formayer, H.: A statistical approach on rapid estimations of climate change indices by monthly instead of daily data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17082, https://doi.org/10.5194/egusphere-egu23-17082, 2023.

EGU23-17197 | Posters on site | ITS1.14/CL5.8

Machine learning workflow for deriving regional geoclimatic clusters from high-dimensional data 

Sebastian Lehner, Katharina Enigl, and Matthias Schlögl

Geoclimatic regions represent climatic forcing zones, which constitute important spatial entities that serve as a basis for a broad range of analyses in the Earth system sciences. The plethora of geospatial variables relevant for obtaining consistent clusters results in high dimensionality, especially when working with high-resolution gridded data, which can render the derivation of such regions complex. This is exacerbated by typical characteristics of geoclimatic data such as multicollinearity, nonlinear effects and potentially complex interactions between features. We therefore present a nonparametric machine learning workflow, consisting of dimensionality reduction and clustering, for deriving geospatial clusters of similar geoclimatic characteristics. We demonstrate the applicability of the proposed procedure using a comprehensive dataset featuring climatological and geomorphometric data from Austria, aggregated to the recent climatological normal from 1992 to 2021.
 
The modelling workflow consists of three major sequential steps: (1) linear dimensionality reduction using Principal Component Analysis, yielding a reduced, orthogonal sub-space, (2) nonlinear dimensionality reduction applied to the reduced sub-space using Uniform Manifold Approximation and Projection, and (3) clustering the learned manifold projection via Hierarchical Density-Based Spatial Clustering of Applications with Noise. The contribution of the input features to the cluster result is then assessed by means of permutation feature importance of random forest models. These are trained by treating the clustering result as a supervised classification problem. Results show the flexibility of the defined workflow and exhibit good agreement with both quantitatively derived and synoptically informed characterizations of geoclimatic regions from other studies. However, this flexibility does entail certain challenges with respect to hyperparameter settings, which require careful exploration and tuning. The proposed workflow may serve as a blueprint for deriving consistent geospatial clusters exhibiting similar geoclimatic attributes.
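
A hedged sketch of the workflow, assuming the umap-learn and hdbscan packages; all hyperparameters are illustrative and, as noted above, require careful tuning:

```python
# Minimal sketch (synthetic stand-in grid cells): PCA -> UMAP -> HDBSCAN, then a
# random forest + permutation importance to assess feature contributions.
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
import umap
import hdbscan

X, _ = make_blobs(n_samples=5000, n_features=40, centers=6, random_state=0)

Z = PCA(n_components=0.95).fit_transform(StandardScaler().fit_transform(X))   # (1) linear
E = umap.UMAP(n_neighbors=30, min_dist=0.0, n_components=2).fit_transform(Z)  # (2) nonlinear
labels = hdbscan.HDBSCAN(min_cluster_size=100).fit_predict(E)                 # (3) clustering

keep = labels >= 0                                   # drop HDBSCAN noise points (-1)
rf = RandomForestClassifier(random_state=0).fit(X[keep], labels[keep])
imp = permutation_importance(rf, X[keep], labels[keep], n_repeats=5, random_state=0)
```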

How to cite: Lehner, S., Enigl, K., and Schlögl, M.: Machine learning workflow for deriving regional geoclimatic clusters from high-dimensional data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17197, https://doi.org/10.5194/egusphere-egu23-17197, 2023.

EGU23-17333 | ECS | Posters on site | ITS1.14/CL5.8

Emulating the regional temperature responses (RTPs) of short-lived climate forcers 

Maura Dewey, Hans Christen Hansson, and Annica M. L. Ekman

Here we develop a statistical model emulating the surface temperature response to changes in emissions of short-lived climate forcers as simulated by an Earth system model. Short-lived climate forcers (SLCFs) are chemical components in the atmosphere that interact with radiation and have both an immediate effect on local air quality and regional and global effects on the climate in terms of changes in temperature and precipitation distributions. The short atmospheric residence times of SLCFs lead to high atmospheric concentrations in emission regions and a highly variable radiative forcing pattern. Regional Temperature Potentials (RTPs) are metrics that quantify the impact of emission changes in a given region on the temperature or forcing response of another, accounting for spatial inhomogeneities in both the forcing and the temperature response, while being easy to compare across models and to use in integrated assessment studies or policy briefs. We have developed a Gaussian-process emulator using output from the Norwegian Earth System Model (NorESM) to predict the temperature responses to regional emission changes in SLCFs (specifically black carbon, organic carbon, sulfur dioxide, and methane) and use this model to calculate regional RTPs and study the sensitivity of surface temperature in a given region, e.g. the Arctic, to anthropogenic emission changes in key policy regions. The main challenge in developing the emulator was creating a training data set that includes maximal SLCF variability in a realistic and policy-relevant range compared to future emission scenarios, while also obtaining a significant temperature response. We also had to account for the confounding influence of greenhouse gases (GHG), which may not follow the same future emission trajectories as SLCFs and can overwhelm the more subtle temperature response that comes from the direct and indirect effects of SLCF emissions. The emulator can potentially provide policy makers with accurate and customizable predictions of the temperature response to proposed emission changes, helping to minimize climate impact.
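
A minimal sketch of a Gaussian-process emulator on toy data, with scikit-learn standing in for the actual emulation framework:

```python
# Minimal sketch (toy data): a GP emulator mapping regional SLCF emission
# perturbations to a regional temperature response, with predictive uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.random((60, 4))        # per-region scalings of BC, OC, SO2, CH4 emissions
y = X @ np.array([0.1, -0.05, -0.3, 0.4]) + 0.02 * rng.standard_normal(60)  # toy Arctic dT

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True).fit(X, y)
mean, std = gp.predict(rng.random((5, 4)), return_std=True)   # prediction + uncertainty
```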

How to cite: Dewey, M., Hansson, H. C., and Ekman, A. M. L.: Emulating the regional temperature responses (RTPs) of short-lived climate forcers, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17333, https://doi.org/10.5194/egusphere-egu23-17333, 2023.

EGU23-542 | ECS | Posters on site | ESSI1.3

Electron Temperature Inference from Fixed Bias Langmuir Probes Set-Ups in Ionospheric Conditions 

Florine Enengl, Sigvald Marholm, Sayan Adhikari, Richard Marchand, and Wojciech J. Miloch

In this work, we show the first achievement of inferring the electron temperature in ionospheric conditions from synthetic data using fixed-bias Langmuir probes operating in the electron saturation region. This was done using machine learning, as well as by altering the probe geometry. The electron temperature is inferred at the same rate as the currents are sampled by the probes. For inferring the electron temperature along with the electron density and the floating potential, a minimum of three probes is required. Furthermore, one probe geometry needs to be distinct from the other two, since otherwise the probe setup may be insensitive to temperature. This can be achieved by having either one shorter probe or a probe of a different geometry, e.g. two longer and one shorter cylindrical probe, or two cylindrical probes and a spherical probe. We use synthetic plasma parameter data and calculate the synthetic collected probe currents to train a neural network (using TensorFlow) and verify the results with a test set as well as with data from the International Reference Ionosphere (IRI) model. A table of currents collected by a spherical probe, computed by Laframboise, was extended to calculate currents for the synthetic plasma parameters at high eta values (eta > 25), covering a large altitude range (100-500 km, within Earth's ionosphere). The extrapolated values were benchmarked against particle-in-cell simulations. Finally, we evaluate the robustness and errors of different probe setups that can be used to infer the electron temperature. As the inferred temperatures are compared to results from the International Reference Ionosphere model, we verify the validity of the inferred temperature at altitudes ranging from about 100-500 km. We show that electron temperature inference from different combinations of spherical and cylindrical probes - three cylindrical probes, three spherical probes, four cylindrical and one spherical probe - can be achieved. Even minor changes in the probe sizing enable the temperature inference and result in root mean square relative errors (RMSRE) between inferred and ground truth data of under 3%. With further optimizations, the RMSRE can even be decreased to under 1%. When limiting the temperature inference to 120-450 km altitude, an RMSRE of under 0.7% is achieved for all probe setups. In the future, the multi-needle Langmuir Probe (m-NLP) instrument dimensions can be adapted for higher temperature inference accuracy.
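
A minimal sketch of the inference network on synthetic stand-ins (the real training data are the computed probe currents described above):

```python
# Minimal sketch (random stand-in data): a TensorFlow network mapping the currents
# of three fixed-bias probes to (Te, ne, floating potential); sizes are illustrative.
import numpy as np
import tensorflow as tf

I = np.random.rand(10000, 3).astype("float32")        # currents of three probes
params = np.random.rand(10000, 3).astype("float32")   # (Te, ne, floating potential)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(3,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3),
])
model.compile(optimizer="adam", loss="mse")
model.fit(I, params, epochs=10, validation_split=0.1)
```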

How to cite: Enengl, F., Marholm, S., Adhikari, S., Marchand, R., and Miloch, W. J.: Electron Temperature Inference from Fixed Bias Langmuir Probes Set-Ups in Ionospheric Conditions, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-542, https://doi.org/10.5194/egusphere-egu23-542, 2023.

EGU23-850 | Posters on site | ESSI1.3

Unsupervised learning of active-region nesting on the Sun 

Emre Isik, Nurdan Karapinar, and Selim Göktug Cankurtaran

Active-region emergence on the Sun shows a degree of clumpiness in both space and time. At a given time, multiple active regions can be seen in what is called active-region- or sunspot-group nests. This tendency also increases the potential to produce large flares and associated CMEs. In the literature, the nesting tendency of active regions is reported in the range of 30-50 per cent, but no statistically robust and ML-based approaches exist so far. Quantifying the nesting degree along an activity cycle and determining its spatial and temporal scales are important to investigate the processes that cause this phenomenon. 

In this study, we estimate the latitudinal and longitudinal extents of active-region nesting from both continuum and magnetogram data, using SDO/HMI synoptic magnetograms and Kislovodsk Mountain Astronomical Station (KMAS) sunspot group data. We carry out kernel density estimation (Fig. 1) and apply unsupervised ML techniques (e.g., DBSCAN and Gaussian mixtures) in the spatial and spatio-temporal domains. Our study reveals trends in the emergence characteristics of sunspot groups on the Sun.


Figure 1: Kernel density estimation with a Gaussian kernel on the time-longitude plane. The dot size indicates sunspot group areas in MSH. 
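
A minimal sketch of the estimation on random stand-in data, combining kernel density estimation on the time-longitude plane with DBSCAN as one of the clustering options mentioned above:

```python
# Minimal sketch (random stand-ins): KDE of active-region emergence on the
# time-longitude plane, plus DBSCAN as one candidate nest-clustering method.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
t = rng.random(300) * 4000      # emergence time [days]
lon = rng.random(300) * 360     # Carrington longitude [deg]

kde = gaussian_kde(np.vstack([t, lon]))         # emergence density (as in Fig. 1)
density = kde(np.vstack([t, lon]))
# rescale time so both axes contribute comparably (eps in "degrees")
nests = DBSCAN(eps=30, min_samples=5).fit_predict(np.column_stack([t / 10, lon]))
```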

How to cite: Isik, E., Karapinar, N., and Cankurtaran, S. G.: Unsupervised learning of active-region nesting on the Sun, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-850, https://doi.org/10.5194/egusphere-egu23-850, 2023.

EGU23-2719 | Posters on site | ESSI1.3

Different types of PCA-NN model for TEC with space weather parameters as predictors: advantages and disadvantages of different NN algorithms 

Anna Morozova, Ricardo Gafeira, Teresa Barata, and Tatiana Barlyaeva

The PCA-NN model for the total electron content (TEC) over the midlatitude region (Iberian Peninsula) presented here uses principal component analysis (PCA) to decompose TEC variations into different modes and reconstructs/forecasts the amplitudes of these modes using neural networks (NN) with different sets of space weather parameters as predictors.

Feedforward, convolutional and recurrent NN algorithms are tested with different sets of predictors. The performance of the models is tested on 3.5 years of observational data obtained during the declining phase of solar cycle 24, which allows us to estimate the models' performance in relation to the solar activity level. The advantages and disadvantages of the different NN algorithms are discussed.
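
A minimal sketch of the PCA-NN idea on synthetic stand-ins, with a simple multilayer perceptron in place of the tested NN variants:

```python
# Minimal sketch (synthetic stand-ins): decompose TEC into PCA modes, then
# regress the mode amplitudes on space-weather drivers and reconstruct TEC.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

tec = np.random.rand(3000, 24)      # daily TEC curves (e.g. hourly values)
drivers = np.random.rand(3000, 5)   # e.g. F10.7, Kp, Dst, seasonal terms

pca = PCA(n_components=3).fit(tec)
amps = pca.transform(tec)                                    # mode amplitudes
nn = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500).fit(drivers, amps)
tec_pred = pca.inverse_transform(nn.predict(drivers))        # reconstructed TEC
```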

How to cite: Morozova, A., Gafeira, R., Barata, T., and Barlyaeva, T.: Different types of PCA-NN model for TEC with space weather parameters as predictors: advantages and disadvantages of different NN algorithms, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2719, https://doi.org/10.5194/egusphere-egu23-2719, 2023.

EGU23-2756 | ECS | Orals | ESSI1.3

SuNeRF: AI enables 3D reconstruction of the solar EUV corona 

Robert Jarolim, Benoit Tremblay, Andres Munoz-Jaramillo, Kyriaki-Margarita Bintsi, Anna Jungbluth, Miraflor Santos, James Paul Mason, Sairam Sundaresan, Cooper Downs, Ronald Caplan, and Angelos Vourlidas

To understand solar evolution and the effects of solar eruptive events, the Sun is continuously observed by multiple satellite missions. The optically-thin emission of the solar plasma and the limited number of viewpoints make it challenging to reconstruct the geometry and structure of the solar atmosphere; however, this information is the missing link to understand the Sun as it is: a three-dimensional, evolving star. We present a method that enables a complete 3D representation of the uppermost solar layer observed in extreme ultraviolet (EUV) light. We use a deep learning approach for 3D scene representation that accounts for radiative transfer to map the entire solar atmosphere from three simultaneous observations. We demonstrate that our approach provides unprecedented reconstructions of the solar poles and directly enables height estimates of coronal structures, solar flux ropes, coronal hole profiles, and coronal mass ejections. We validate the approach using model-generated synthetic EUV images, finding that our method accurately captures the 3D geometry even from a limited number of viewpoints. We quantify the uncertainties of our model using an ensemble approach that allows us to estimate the model performance in the absence of a ground truth. Our method enables a novel view of our closest star, and is a breakthrough technology for the efficient use of multi-instrument datasets, which paves the way for future cluster missions.

How to cite: Jarolim, R., Tremblay, B., Munoz-Jaramillo, A., Bintsi, K.-M., Jungbluth, A., Santos, M., Mason, J. P., Sundaresan, S., Downs, C., Caplan, R., and Vourlidas, A.: SuNeRF: AI enables 3D reconstruction of the solar EUV corona, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2756, https://doi.org/10.5194/egusphere-egu23-2756, 2023.

EGU23-2897 | Orals | ESSI1.3

Automatic Classification of THEMIS All-Sky Images via Self-Supervised Semi-Supervised Learning 

Jeremiah Johnson, Dogacan Ozturk, Hyunju Connor, Donald Hampton, Matthew Blandin, and Amy Keesee

Dynamic interactions between the solar wind and the magnetosphere give rise to dramatic auroral forms that have been instrumental in the ground-based study of magnetospheric dynamics. The general mechanisms of aurora types and their large-scale patterns are well known, but the morphology of small- to meso-scale auroral forms observed in all-sky imagers and their relation to magnetospheric dynamics and the coupling of the magnetosphere to the upper atmosphere remain in question. Machine learning has the potential to provide answers to these questions, but most existing auroral image data lack the ground-truth labels required for supervised learning and conventional statistical analyses. To mitigate this issue, we propose a novel self-supervised semi-supervised algorithm to automatically label the THEMIS all-sky image database. Specifically, we adapt the self-supervised Simple framework for Contrastive Learning of Representations (SimCLR) algorithm to learn latent representations of THEMIS all-sky images. These representations are fine-tuned using a small set of manually labeled data from the Oslo Aurora THEMIS (OATH) dataset, after which semi-supervised classification is used to train a classifier, beginning by training on the manually labeled OATH dataset and gradually incorporating the classifier's most confident predictions on unlabeled data into the training dataset as ground truth. We demonstrate that (a) classifiers fit to the learned representations of the manually labeled images achieve state-of-the-art performance, improving the classification accuracy by almost 10% over the current benchmark on labeled data; and (b) our model's learned representations naturally cluster into more clusters than manually assigned categories, suggesting that existing categorizations are coarse and may obscure important connections between auroral types and their drivers. Finally, we introduce AuroraClick, a citizen science project with the goal of manually annotating a large representative sample of THEMIS all-sky images for the validation of our current models and the training of future models.
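
A minimal sketch of the semi-supervised self-training step on dummy representations, with a linear classifier standing in for the actual model and an illustrative confidence threshold:

```python
# Minimal sketch (dummy SimCLR representations): train on labeled data, then fold
# the most confident unlabeled predictions back in as pseudo-labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab, y_lab = rng.random((500, 128)), rng.integers(0, 6, 500)   # OATH-like labels
X_unl = rng.random((5000, 128))                                  # unlabeled images

for _ in range(5):
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    proba = clf.predict_proba(X_unl)
    conf = proba.max(axis=1) > 0.95                 # confidence threshold (illustrative)
    X_lab = np.vstack([X_lab, X_unl[conf]])
    y_lab = np.concatenate([y_lab, proba[conf].argmax(axis=1)])
    X_unl = X_unl[~conf]
```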

How to cite: Johnson, J., Ozturk, D., Connor, H., Hampton, D., Blandin, M., and Keesee, A.: Automatic Classification of THEMIS All-Sky Images via Self-Supervised Semi-Supervised Learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2897, https://doi.org/10.5194/egusphere-egu23-2897, 2023.

EGU23-3379 | ECS | Posters on site | ESSI1.3

Estimation and Prediction of Solar Wind Propagation from L1 Point to Earth’s Bow Shock 

Samira Tasnim, Ying Zou, Claudia Borries, Carsten Baumann, Brian Walsh, Krishna Khanal, Connor O'Brien, and Huaming Zhang

Having precise knowledge of the near-Earth solar wind (SW) and the embedded interplanetary magnetic field (IMF) is of critical importance to space weather operations, because SW and IMF data are used in almost all magnetospheric and ionospheric models. The most widely used data source, OMNI, propagates SW properties from the Lagrangian point L1 to the Earth's bow shock by estimating the propagation time of the SW. However, the time difference between the OMNI time-shifted IMF and the best match-up of the IMF can reach ~15 min. Firstly, we aim to develop an improved statistical algorithm to address the SW propagation delay problem of space weather prediction. The algorithm focuses on matching SW features around the L1 point and upstream of the bow shock by computing the variance, the cross-correlation coefficient, the plateau-shaped magnitude index, and the non-dimensional measure of average error index between the measurements at the two locations. The obtained propagation times are then compared to OMNI. Factors that limit the OMNI accuracy are also examined. Secondly, the automatic algorithm allows us to generate large sets of input and target variables using multiple spacecraft pairs at L1 and near-Earth locations to train, validate, and test machine learning models that specify and forecast near-Earth SW conditions. Finally, we offer a machine learning (ML) approach to specify and predict the propagation time from L1 monitors to a given location upstream of or at the bow shock and to forecast near-Earth SW conditions, using gradient boosting and random forest models, i.e., ensembles of decision trees.
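
A minimal sketch of the feature-matching idea: on synthetic series, the propagation delay is estimated as the lag maximizing the cross-correlation between L1 and near-Earth measurements:

```python
# Minimal sketch (synthetic series): estimating the propagation delay as the lag
# that maximizes the cross-correlation between the two locations.
import numpy as np

rng = np.random.default_rng(0)
l1 = rng.standard_normal(2000)                                   # e.g. IMF Bz at L1, 1-min cadence
near_earth = np.roll(l1, 45) + 0.2 * rng.standard_normal(2000)   # true delay: 45 min

lags = np.arange(20, 90)
cc = [np.corrcoef(l1[:-lag], near_earth[lag:])[0, 1] for lag in lags]
print("estimated delay [min]:", lags[int(np.argmax(cc))])
```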

How to cite: Tasnim, S., Zou, Y., Borries, C., Baumann, C., Walsh, B., Khanal, K., O'Brien, C., and Zhang, H.: Estimation and Prediction of Solar Wind Propagation from L1 Point to Earth’s Bow Shock, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3379, https://doi.org/10.5194/egusphere-egu23-3379, 2023.

EGU23-4069 | ECS | Posters on site | ESSI1.3

Plasma-Sheet Bubble Identification Using Muitivariate Time Series Classification 

Feng Xuedong and Yang Jian

Abstract: Plasma-sheet bubbles play a major role in the process of magnetotail particle injections. They are defined as fast flows with reduced plasma density or pressure, accompanied by magnetic field dipolarization. Typically, these bubbles can be detected from in-situ observations, but the subjective uncertainty requires human verification. In this study, we combine three different methods, namely the MINImally RandOm Convolutional KErnel Transform (MiniRocket) and 1D and 2D convolutional neural networks (CNNs), to identify bubbles. The imbalanced training dataset consists of bubble and non-bubble events with a ratio of 1:40 from the years 2007 to 2020. The results indicate that the accuracy of all three models is around 99%, and the precision and recall rates of all three models are above 80% on both the validation and test datasets. The three methods are combined, with the intersection set as the minimum set of predictions and the union set as the maximum set, which greatly reduces the number of false positives. In identifying bubbles in the observations of the year 2021, our neural network models are found to be comparable to the traditional criteria and manual inspection. Using joint machine learning forecasting methods, we can easily and automatically identify bubbles without the a priori knowledge of a domain expert.

Keywords: plasma-sheet bubble, multivariate time series classification, class imbalance, image identification
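
A hedged sketch of the MiniRocket branch, assuming the sktime implementation and dummy multivariate time series:

```python
# Minimal sketch (dummy data; assumes the sktime package): MiniRocket features
# with a linear classifier, one of the three combined methods.
import numpy as np
from sktime.transformations.panel.rocket import MiniRocketMultivariate
from sklearn.linear_model import RidgeClassifierCV

rng = np.random.default_rng(0)
X = rng.random((200, 6, 128))     # (events, variables e.g. B/V/density, time steps)
y = rng.integers(0, 2, 200)       # bubble vs non-bubble (heavily imbalanced in reality)

trf = MiniRocketMultivariate().fit(X)
clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10)).fit(trf.transform(X), y)
```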

How to cite: Xuedong, F. and Jian, Y.: Plasma-Sheet Bubble Identification Using Muitivariate Time Series Classification, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4069, https://doi.org/10.5194/egusphere-egu23-4069, 2023.

EGU23-5254 | Posters on site | ESSI1.3

AI Assisted Data Selection of Laser Altimeter Observations 

Oliver Stenzel, Lukas Maes, and Martin Hilchenbach

Laser altimeters create large amounts of data that often have to be preprocessed and checked before further use. The BepiColombo mission to Mercury is set to arrive in December 2025, and observations with the BepiColombo Laser Altimeter (BELA; Benkhoff et al., 2010; Thomas et al., 2021) will start during the following spring. These measurements are planned to be used to derive information about the tides of Mercury (Thor et al., 2020). Careful assessment, selection, and filtering of the raw data is needed to extract the small tidal signal. Until the BELA data become available, artificial data and records from other missions have to be used to study the data selection strategy. We present our work on Mercury Laser Altimeter data (MLA; Cavanaugh et al., 2007), using a convolutional neural network to sort observations on an orbit-by-orbit basis into different classes. The already existing neural network (Stenzel and Hilchenbach, 2021; Stenzel, Thor and Hilchenbach, 2021) is tuned, and a new test data set is created.
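
A minimal sketch of such an orbit-wise profile classifier on dummy data; the architecture and class labels are illustrative:

```python
# Minimal sketch (dummy data): a 1D CNN sorting altimeter observations into
# quality classes on an orbit-by-orbit basis.
import numpy as np
import tensorflow as tf

profiles = np.random.rand(500, 1024, 1).astype("float32")  # range returns per orbit
labels = np.random.randint(0, 3, 500)                      # e.g. good / noisy / corrupted

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, 7, activation="relu", input_shape=(1024, 1)),
    tf.keras.layers.MaxPooling1D(4),
    tf.keras.layers.Conv1D(32, 7, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(profiles, labels, epochs=5)
```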

 

Benkhoff, J. et al. (2010) ‘BepiColombo—Comprehensive exploration of Mercury: Mission overview and science goals’, Planetary and Space Science, 58(1), pp. 2–20. Available at: https://doi.org/10.1016/j.pss.2009.09.020.

Cavanaugh, J.F. et al. (2007) ‘The Mercury Laser Altimeter Instrument for the MESSENGER Mission’, Space Science Reviews, 131(1), pp. 451–479. Available at: https://doi.org/10.1007/s11214-007-9273-4.

Stenzel, O. and Hilchenbach, M. (2021) 'Towards machine learning assisted error identification in orbital laser altimetry for tides derivation', pp. EPSC2021-688. Available at: https://doi.org/10.5194/epsc2021-688.

Stenzel, O., Thor, R. and Hilchenbach, M. (2021) ‘Error identification in orbital laser altimeter data by machine learning’, pp. EGU21-14749. Available at: https://doi.org/10.5194/egusphere-egu21-14749.

Thomas, N. et al. (2021) ‘The BepiColombo Laser Altimeter’, Space Science Reviews, 217(1), p. 25. Available at: https://doi.org/10.1007/s11214-021-00794-y.

Thor, R.N. et al. (2020) ‘Prospects for measuring Mercury’s tidal Love number h2 with the BepiColombo Laser Altimeter’, Astronomy & Astrophysics, 633, p. A85. Available at: https://doi.org/10.1051/0004-6361/201936517.

 

How to cite: Stenzel, O., Maes, L., and Hilchenbach, M.: AI Assisted Data Selection of Laser Altimeter Observations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5254, https://doi.org/10.5194/egusphere-egu23-5254, 2023.

EGU23-6968 | ECS | Posters on site | ESSI1.3

Forecasting solar wind speed by machine learning based on coronal hole characteristics 

Daniel Collin, Stefano Bianco, Guillermo Gallego, and Yuri Shprits

One of the main sources of solar wind disturbances are coronal holes, which can be identified in extreme ultraviolet (EUV) images of the Sun. Previous research has shown the connection between coronal holes and an increase of the solar wind speed at Earth. The time lag between the appearance of coronal holes on the visible side of the Sun and their effects at Earth is 2-5 days. In this study, a machine learning model predicting the solar wind speed originating from coronal holes is proposed. It is based on the analysis of solar EUV images. A segmentation algorithm is applied to the images in order to identify coronal holes and derive their characteristics (e.g. area, location). We also present a new method to calculate the geoeffective coronal hole area: instead of specifying in advance a sector of the solar surface in which the area is measured and a lag time between the area measurement and the arrival of the solar wind, the specification of this sector and the corresponding delay are formulated as a mathematical optimization problem and included in the machine learning model. This approach improves the prediction accuracy and also prolongs the prediction horizon, as the solar wind speed can be predicted up to approximately 5 days in advance of the disturbance. Several machine learning model architectures are explored. We also study how the time evolution can be included in the model.
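
A minimal sketch of treating the sector and delay as free parameters, on synthetic stand-in data and via a plain grid search maximizing correlation (the study embeds the optimization in the ML model itself):

```python
# Minimal sketch (synthetic stand-ins): choosing the geoeffective sector (start,
# width) and the time lag that maximize correlation with the solar wind speed.
import numpy as np

rng = np.random.default_rng(0)
days = 700
ch_area = rng.random((days, 36))             # daily CH area per 10-deg longitude sector
speed = rng.random(days) * 400 + 300         # solar wind speed at Earth [km/s]

best = max(
    ((np.corrcoef(ch_area[:days - lag, s0:s0 + w].sum(axis=1), speed[lag:])[0, 1], s0, w, lag)
     for s0 in range(36) for w in (3, 5, 7) for lag in range(1, 6) if s0 + w <= 36),
    key=lambda t: t[0],
)
print("corr=%.2f sector=%d width=%d lag=%d days" % best)
```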

How to cite: Collin, D., Bianco, S., Gallego, G., and Shprits, Y.: Forecasting solar wind speed by machine learning based on coronal hole characteristics, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6968, https://doi.org/10.5194/egusphere-egu23-6968, 2023.

EGU23-7529 | ECS | Posters on site | ESSI1.3

Landform detection on Mars using image segmentation with a u-net convolutional neural network architecture 

Florian Auer-Welsbach, Andreas Windisch, and Giacomo Nodjoumi

The detection and classification of landforms on planetary surfaces is a time-consuming task that relies deeply on expert knowledge. Such a process can be partially automated and optimized in a resource-efficient way using image processing algorithms. By classifying the surface into different landforms, such as volcanic craters, asteroid impact craters, dunes, and more, several analyses can be performed, for instance the widely used crater-counting age estimation method. In addition, by conducting these analyses, information about the characteristics and properties of a planet can be revealed. One of the major challenges for the implementation of these algorithms is to provide a generalized model. In many cases the generalization error tends to be very large, and therefore a satisfactory accuracy on the test data set cannot be achieved, preventing reliable evaluation of new, unseen data. In this work, a multi-class image segmentation algorithm is presented, which is based on a U-net convolutional neural network architecture. U-nets classify each pixel of a given input image and can thus produce segmentation masks for various landforms. Given that enough labeled data are available, such a classifier can replace manual detection and classification, thereby saving resources by providing a fast method for landform detection.
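
A minimal sketch of a small U-net for per-pixel multi-class segmentation in Keras; depth and filter counts are illustrative, not the presented architecture:

```python
# Minimal sketch (illustrative depth/filters): a tiny U-net producing per-pixel
# class probabilities, i.e. segmentation masks for landform classes.
import tensorflow as tf
from tensorflow.keras import layers

def tiny_unet(n_classes=4, size=128):
    inp = layers.Input((size, size, 1))
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    p2 = layers.MaxPooling2D()(c2)
    b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)
    u2 = layers.concatenate([layers.UpSampling2D()(b), c2])      # skip connection
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(u2)
    u1 = layers.concatenate([layers.UpSampling2D()(c3), c1])     # skip connection
    c4 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)
    out = layers.Conv2D(n_classes, 1, activation="softmax")(c4)  # per-pixel classes
    return tf.keras.Model(inp, out)

model = tiny_unet()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```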

How to cite: Auer-Welsbach, F., Windisch, A., and Nodjoumi, G.: Landform detection on Mars using image segmentation with a u-net convolutional neural network architecture, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7529, https://doi.org/10.5194/egusphere-egu23-7529, 2023.

EGU23-7761 | ECS | Posters virtual | ESSI1.3

Comparison study on the deep-learning-based detection of Mars craters 

Hind AlRiyami, Claus Gebhardt, and Christopher Lee

Deep-learning methods are of interest for the analysis of imagery and digital elevation models from Mars-orbiting satellites. They can detect various atmospheric and surface features, for instance dust storms and craters [1,2]. We approach this topic by using the deep-learning-based crater detection algorithm DeepMars2 [3,4]. The algorithm is applied to two digital elevation models (DEMs) of the Mars surface. The DEMs are based on the satellite instruments MOLA/MGS (Mars Orbiter Laser Altimeter/Mars Global Surveyor) and HRSC/MEX (High Resolution Stereo Camera/Mars Express) and have different resolutions. Crater detection statistics are compared between the two DEMs.

[1] Alshehhi, R., Gebhardt, C. Detection of Martian dust storms using mask regional convolutional neural networks. Prog Earth Planet Sci 9, 4 (2022). https://doi.org/10.1186/s40645-021-00464-1

[2] R. Alshehhi and C. Gebhardt, "Automated Geological Landmarks Detection on Mars Using Deep Domain Adaptation From Lunar High-Resolution Satellite Images," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 2274-2283, 2022, doi: 10.1109/JSTARS.2022.3156371.

[3] Lee, C. (2019). Automated crater detection on Mars using deep learning. Planetary and Space Science, 170, 16-28. https://doi.org/10.1016/j.pss.2019.03.008

[4] Lee, C. & Hogan, J. (2021). Automated crater detection with human level performance. Computers & Geosciences, 147, 104645. https://doi.org/10.1016/j.cageo.2020.104645

How to cite: AlRiyami, H., Gebhardt, C., and Lee, C.: Comparison study on the deep-learning-based detection of Mars craters, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7761, https://doi.org/10.5194/egusphere-egu23-7761, 2023.

EGU23-7941 | ECS | Orals | ESSI1.3

Machine learning ensemble models for solar wind speed prediction 

Federico Sabbatini and Catia Grimani

Machine learning models trained to reproduce space mission observations are precious resources to fill gaps of missing data in measurement time series or to perform data forecasting within a reasonable degree of uncertainty. The latter option is of particular importance for future space missions that will not host instrumentation dedicated to interplanetary medium parameter monitoring. The future LISA mission for low-frequency gravitational wave detection, for instance, will benefit from particle detectors to measure the galactic cosmic-ray integral flux variations and from magnetometers that will allow monitoring of the passage of large-scale magnetic structures through the three LISA spacecraft, as part of a diagnostics subsystem. Unfortunately, no instruments dedicated to solar wind speed measurements will be present on board the spacecraft constellation. Moreover, LISA, scheduled for launch in 2035, will trail Earth on the ecliptic at a distance of 50 million km, far from the orbits of other space missions dedicated to interplanetary medium monitoring.

Based on precious lessons learned with LISA Pathfinder, the ESA LISA precursor mission, about the correlation between galactic cosmic-ray flux short-term variations and solar wind speed increases, we built a machine learning ensemble model able to reconstruct the solar wind speed trend solely on the basis of contemporaneous and preceding observations of galactic cosmic-ray flux variations. Details about the model creation and performance will be presented, together with a description of the underlying data set, the weak predictors and the training phase. Advantages and limitations will be discussed, showing that the model performance may be enhanced by providing interplanetary magnetic field intensity observations as additional input data, with the goal of providing the LISA mission with an effective solar wind speed predictive tool.
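
A minimal sketch of such an ensemble on synthetic stand-ins, with lagged cosmic-ray flux features as the only inputs:

```python
# Minimal sketch (synthetic stand-ins): an ensemble of weak regressors predicting
# solar wind speed from contemporaneous and preceding cosmic-ray flux variations.
import numpy as np
from sklearn.ensemble import VotingRegressor, GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
flux = rng.random(3000)                                               # GCR integral flux
X = np.column_stack([np.roll(flux, k) for k in range(0, 48, 6)])[48:]  # lagged inputs
y = (rng.random(3000) * 400 + 300)[48:]                               # solar wind speed [km/s]

ens = VotingRegressor([
    ("gbr", GradientBoostingRegressor()),
    ("rf", RandomForestRegressor(n_estimators=200)),
    ("ridge", Ridge()),
]).fit(X, y)
```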

How to cite: Sabbatini, F. and Grimani, C.: Machine learning ensemble models for solar wind speed prediction, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7941, https://doi.org/10.5194/egusphere-egu23-7941, 2023.

EGU23-8430 | Orals | ESSI1.3

Modelling Jupiter's global and regional magnetic fields using physics-informed neural networks 

Longwei Chen, Phil Livermore, Leyuan Wu, Sjoerd de Ridder, and Chong Zhang

Neural networks can universally approximate complex functions, which naturally makes them suitable candidates for representing solutions governed by complex partial differential equations (PDEs). For the planetary magnetic field modelling problem, spherical harmonic functions are the standard modelling method. The spherical harmonic method requires globally, nearly uniformly distributed observations, and it has quite limited ability for regional field modelling. Neural networks, instead, have great potential to deal with both global and regional modelling problems. In this work, we thoroughly investigate the ability of neural networks to represent magnetic fields at global and regional scales, and concentrate on a specific type of neural network, namely physics-informed neural networks (PINNs), for the implementation. PINNs make it easier to incorporate different kinds of physics within a uniform optimization framework. Through synthetic model tests and partial mathematical proof, we showcase the importance of employing the natural boundary condition, the Laplace equation constraint and the Poisson equation constraint at suitable collocation points for a reasonable and accurate magnetic field representation, and we introduce the detailed implementation scheme. Finally, we use newly released Juno mission measurements and present a global PINNs model of Jupiter's magnetic field and a regional PINNs model for the Great Blue Spot (GBS) region. Comparisons with spherical harmonic models have been conducted to evaluate the correctness and flexibility of the PINNs models.
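
A minimal sketch of the PINN ingredients on dummy data: a network for a scalar potential V, with B = -grad V fitted to observations and a Laplace-equation penalty at collocation points (the published model's architecture and constraints are richer):

```python
# Minimal sketch (dummy data): data misfit on B = -grad V plus a Laplace penalty
# |lap V|^2 at collocation points, both via automatic differentiation.
import torch

net = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 1))

def laplacian(xyz):
    xyz = xyz.requires_grad_(True)
    g = torch.autograd.grad(net(xyz).sum(), xyz, create_graph=True)[0]
    lap = 0.0
    for i in range(3):  # sum of second derivatives d2V/dxi2
        lap = lap + torch.autograd.grad(g[:, i].sum(), xyz, create_graph=True)[0][:, i]
    return lap

obs_xyz, obs_B = torch.randn(128, 3), torch.randn(128, 3)   # dummy Juno-like samples
colloc = torch.randn(512, 3)                                # collocation points

xyz = obs_xyz.requires_grad_(True)
B_pred = -torch.autograd.grad(net(xyz).sum(), xyz, create_graph=True)[0]
loss = ((B_pred - obs_B) ** 2).mean() + (laplacian(colloc) ** 2).mean()
loss.backward()  # step with any torch optimizer
```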

How to cite: Chen, L., Livermore, P., Wu, L., de Ridder, S., and Zhang, C.: Modelling Jupiter's global and regional magnetic fields using physics-informed neural networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8430, https://doi.org/10.5194/egusphere-egu23-8430, 2023.

To accurately predict potential future impacts with the Earth, it is crucial to continuously examine the area around it for Near Earth Objects (NEOs) and particularly Near Earth Asteroids (NEAs). Large data sets of astronomical images must be analyzed in order to accomplish this task. NEARBY [1] offers such a processing and analysis platform based on Cloud computing. Although this method is automated, the results are validated by human observers after potential asteroids have been identified from the raw data. It is crucial that the number of candidate objects does not outweigh the available human resources. We believe we can maximize the advantages of having access to enormous amounts of data in the field of astronomy by combining artificial intelligence with the use of high-performance distributed processing infrastructures such as Cloud-based solutions. This research is carried out as part of the CERES project, which aims to design and implement a software solution that can classify objects found in astronomical images. The objective is to identify and recognize asteroids. To achieve this goal, we use machine learning techniques to develop an asteroid classification model. It is essential to reduce the number of false negative findings. The major objective of the current paper is to assess how well deep CNNs perform when it comes to categorizing astronomical objects, particularly asteroids. We will compare the outcomes of several of the most well-known deep convolutional neural networks (CNNs), including InceptionV3, Xception, InceptionResNetV2, and ResNet152V2. These cutting-edge classification CNNs are used to investigate the best approach to this specific classification challenge, either through full training or through fine-tuning.
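
A minimal sketch of the fine-tuning variant for one of the listed networks, on dummy image cutouts:

```python
# Minimal sketch (dummy data): fine-tuning a pretrained InceptionV3 head for
# asteroid vs. artefact classification; image size and head are illustrative.
import numpy as np
import tensorflow as tf

base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                         input_shape=(128, 128, 3), pooling="avg")
base.trainable = False                   # train only the new head first
model = tf.keras.Sequential([base,
                             tf.keras.layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")

X = np.random.rand(64, 128, 128, 3).astype("float32")   # cutouts around detections
y = np.random.randint(0, 2, 64)                         # asteroid = 1, artefact = 0
model.fit(X, y, epochs=2)
```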

Acknowledgment: This work was partially supported by a grant of the Romanian Ministry of Education and Research, CCCDI - UEFISCDI, project number PN-III-P2-2.1-PED-2019-0796, within PNCDI III. This research was partially supported by the project 38 PFE in the frame of the programme PDI-PFE-CDI 2021.

References:

1. Bacu, V., Sabou, A., Stefanut, T., Gorgan, D., Vaduvescu, O., NEARBY platform for detecting asteroids in astronomical images using cloud-based containerized applications, 2018 IEEE 14th International Conference on Intelligent Computer Communication and Processing (ICCP), pp. 371-376

How to cite: Bacu, V.: Software solution for detecting asteroids using machine learning techniques, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8676, https://doi.org/10.5194/egusphere-egu23-8676, 2023.

To monitor the results of our instrument, we automatically generate a series of daily plots; in our case, two plot types for four spacecraft for five different species. Unfortunately, due to circumstances beyond our control (primarily network and system issues), plot generation would sometimes fail, and unless checked daily, the plots were unavailable when finally needed.

To solve this problem, we investigated using Computer Vision (OpenCV) to validate our generation of daily plots. It proved surprisingly easy and more advantageous than either monitoring the plots manually each day or using simpler heuristics. By using the cloud, we were able to improve throughput as well. Future work would be to use Computer Vision to analyze the data within the plots for actual scientific study.
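A minimal sketch of such a check is shown below, using OpenCV and NumPy; the validity heuristic, threshold and file name are placeholders rather than our production code.

import cv2
import numpy as np

def plot_is_valid(path, min_std=5.0):
    # Heuristic validation: the file must load and must not be (near-)blank.
    img = cv2.imread(path)                    # returns None if missing/corrupt
    if img is None:
        return False
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return float(np.std(gray)) >= min_std     # blank plots have ~zero variance

# Example: check one spacecraft/species plot for a given day.
print(plot_is_valid("plots/sc1_species_H_2023-04-23.png"))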

How to cite: Mukherjee, J.: Using Artificial Intelligence/Computer Vision for Automated Plot Validation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8946, https://doi.org/10.5194/egusphere-egu23-8946, 2023.

EGU23-10654 | ECS | Orals | ESSI1.3

Predicting the 1 AU Arrival Time of Coronal Mass Ejections Based on Convolutional Neural Network 

Yi Yang, Fang Shen, Yucong Li, and Rongpei Lin

Coronal mass ejections (CMEs) are among the most violent solar eruptions, bursting out large amounts of magnetized plasma at speeds up to thousands of kilometers per second. When it reaches the Earth, a CME can cause a geomagnetic storm, affecting aviation safety, satellite operations, communications systems and power facilities. Therefore, fast and accurate prediction of CME arrival time is crucial for avoiding severe damaging effects and reducing economic losses. The initial morphology and kinematics of a CME in the corona can be observed by the coronagraphs on the Solar and Heliospheric Observatory (SOHO), so coronagraph observations should be useful for predicting CME arrival times. In this study, a convolutional neural network (CNN) is used to extract features of SOHO/LASCO coronagraph images related to the CME transit time and to establish a model capable of predicting the CME arrival time. The influence of different CNN hyperparameters on the prediction results is studied. Further, we add a physical constraint, the initial velocity of the CME, to the basic CNN outputs and find that this yields smaller prediction errors.
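One common way to append such a physical constraint to CNN image features is sketched below in PyTorch; the layer sizes, image dimensions and units are illustrative assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn

class CMETransitNet(nn.Module):
    # CNN on coronagraph images, with CME initial speed as an extra input.
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(32 + 1, 32), nn.ReLU(),
                                  nn.Linear(32, 1))   # transit time (hours)

    def forward(self, image, v_init):
        feats = self.cnn(image)
        return self.head(torch.cat([feats, v_init], dim=1))

model = CMETransitNet()
img = torch.rand(4, 1, 128, 128)     # placeholder LASCO difference images
v = torch.rand(4, 1) * 2000.0        # placeholder initial speeds (km/s)
pred_hours = model(img, v)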

How to cite: Yang, Y., Shen, F., Li, Y., and Lin, R.: Predicting the 1 AU Arrival Time of Coronal Mass Ejections Based on Convolutional Neural Network, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10654, https://doi.org/10.5194/egusphere-egu23-10654, 2023.

Solar eruptive events are complex phenomena, which most often include solar flares, filament eruptions, coronal mass ejections (CMEs), and CME-driven shock waves. CME-driven shocks in the corona and interplanetary space are considered to be the main producer of solar energetic particles (SEPs). A number of fundamental questions remain about how SEPs are produced. Current understanding points to CME-driven shocks and compressions in the solar corona.

CME kinematics typically show three phases: an initial rising phase (weakly accelerated motion), an impulsive phase, and a residual propagation phase with constant or decreasing speed.

Despite the significant amount of data available from ground-based instruments (COSMO K-Cor, LOFAR) and remote instruments onboard heliospheric space missions (SDO AIA, SOHO), processing the data still requires considerable effort. Most algorithms currently used for solar feature detection and tracking are known for their limited applicability and the complexity of their processing chains, while the use of data-driven approaches for tracking CME-related phenomena is currently limited by the insufficiency of training sets.

Recently (Stepanyuk et al., J. Space Weather Space Clim., Vol. 12, 20, 2022), we demonstrated a method and software (https://gitlab.com/iahelio/mosaiics/wavetrack) for smart characterization and tracking of solar eruptive features based on the à trous wavelet decomposition technique, intensity rankings, and a set of filtering techniques. In this work we use Wavetrack to generate training sets for data-driven feature extraction and characterization. We utilize U-Net, a fully convolutional network whose training strategy relies on heavy use of data augmentation to exploit the available annotated samples more efficiently (a minimal U-Net is sketched below). U-Net can be trained end-to-end from a very limited set of images, while feature engineering allows this approach to be improved even further by expanding the available training sets.
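For reference, the sketch below shows the essential U-Net pattern (an encoder-decoder with a skip connection) in PyTorch; a real model would use more depth and channels, and all sizes here are illustrative assumptions.

import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    # One down/up level with a skip connection; real U-Nets stack several.
    def __init__(self):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.down2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.out = nn.Conv2d(32, 1, 1)    # per-pixel logit: feature mask

    def forward(self, x):
        d1 = self.down1(x)
        d2 = self.down2(self.pool(d1))
        u = self.up(d2)
        return self.out(torch.cat([u, d1], dim=1))   # skip connection

model = MiniUNet()
images = torch.rand(4, 1, 128, 128)                   # placeholder EUV frames
masks = (torch.rand(4, 1, 128, 128) > 0.5).float()    # Wavetrack-style targets
loss = nn.functional.binary_cross_entropy_with_logits(model(images), masks)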

Here we present pre-trained models and demonstrate data-driven characterization and tracking of solar eruptive features on a set of CME-events.

How to cite: Stepanyuk, O. and Kozarev, K.: Advanced Multi-Instrument and Multi-Wavelength Image Processing and Feature Tracking for Remote CME Characterization with Convolutional Neural Network, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10705, https://doi.org/10.5194/egusphere-egu23-10705, 2023.

EGU23-11898 | ECS | Orals | ESSI1.3

Composition Analysis of an Apatite Crystal using a Space-Prototype Mass Spectrometric Instrument and Machine Learning for Unsupervised Mineralogical Phase Detection 

Salome Gruchola, Marek Tulej, Peter Keresztes Schmidt, Rustam Lukmanov, Andreas Riedo, and Peter Wurz

We present the analysis of a 2.06 Ga apatite crystal obtained from an ultramafic phoscorite rock from the Phalaborwa Complex (Limpopo Province, South Africa) [1]. A space-prototype laser ablation ionisation mass spectrometer (LIMS) [2,3] was used to study the chemical composition of the sample. Mass spectra were recorded from a sample area of 0.6 × 0.6 mm², with a spatial resolution of 30 μm and sub-micrometre depth resolution.

Apatite is a calcium phosphate mineral expressed by the stoichiometric chemical formula [Ca5(PO4)3(F, Cl, OH)]. The halogen site, occupied by F, Cl, and OH, corresponds to an isomorphous series with fluor-, chlor- and hydroxyl-apatite end members, respectively. Apatite, being an accessory mineral in igneous and other rocks, commonly contains a range of other elements that do not fit well into the major rock-forming minerals, such as rare earth elements (REE). These are suitable targets for investigating physical and chemical conditions in igneous rocks and the volatile evolution of magmas.

The analysis of the spectra recorded with our LIMS system for the abundances of the elements of interest at each location was performed in two steps. First, the abundances of each element across the sampled area were compiled in element maps. Second, an unsupervised machine learning algorithm based on clustering and network analysis was applied to the data set of analysed mass spectra to separate it into groups of distinct chemical composition. Subsequently, a more detailed analysis was conducted on each of the recovered groups to assign the corresponding mineral. In addition to the group of spectra belonging to apatite, which was assigned to fluorapatite, other minerals were identified, among them olivine. This method yields an unsupervised approach to identifying the different mineralogical entities present within a sample. The same network analysis method was previously applied to a 1.88 Ga Gunflint sample (Ontario, Canada) to separate spectra recorded from the host (chert) from spectra containing signatures of organic matter from fossilized microbes [4].

Given that the data were recorded using a miniature mass spectrometer designed for space flight, this analysis demonstrates the analytical capabilities of our LIMS system that could be achieved in situ on other planetary bodies in our Solar System, for example on the Moon or on Mars. The current performance of this miniature LIMS instrument for studying the chemical composition of apatite is sufficiently high to measure volatiles (H, F, Cl) and nearly all relevant mineral-forming and some trace elements (Na, C, Mg, Si, S, K, Mn, Fe, Sr, Ba), including REE (La, Ce, Pr, Sm), which allows for a systematic quantitative analysis of their distribution.

[1] Tulej, M. et al., 2022, https://doi.org/10.3390/universe8080410.

[2] Riedo, A. et al., 2012, https://doi.org/10.1002/jms.3104.

[3] Tulej, M. et al., 2021, https://doi.org/10.3390/app11062562.

[4] Lukmanov, R.A. et al., 2022, https://doi.org/10.3389/frspt.2022.718943

How to cite: Gruchola, S., Tulej, M., Keresztes Schmidt, P., Lukmanov, R., Riedo, A., and Wurz, P.: Composition Analysis of an Apatite Crystal using a Space-Prototype Mass Spectrometric Instrument and Machine Learning for Unsupervised Mineralogical Phase Detection, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11898, https://doi.org/10.5194/egusphere-egu23-11898, 2023.

The origin of the cold materials identified by different criteria is unclear. They are strongly suspected to be erupted prominence material; however, some cold materials defined by charge depletion exist in both the solar wind and ICMEs. Recent solar observations show failed prominence eruptions in CMEs, in which the prominence does not propagate into interplanetary space. Besides, the prominence eruptions related to Earth-directed ICMEs at 1 au were difficult to identify before the launch of the STEREO mission. This work uses Random Forest (RF), an interpretable supervised machine learning classifier, to study the distinct signatures of prominence cold materials (PCs) compared to the quiet solar wind (SW) and ICMEs. Twelve parameters measured by ACE at 1 au are used in this study: proton moments, the magnetic field component Bz, He/H, He/O, Fe/O, the mean charges of oxygen and carbon, C6+/C5+, C6+/C4+, and O7+/O6+. From the weights returned by the RF classifier and the training accuracy of one black-box classifier, the most important in situ signatures of PCs are obtained. Next, the trained RF classifier is used to check the category of the origin-unknown cold materials in ICMEs. The results show that most of the cold materials are from prominences, but two of them possibly come from the quiet solar wind. The most distinct signatures of PCs are lower charge states of C and O, lower proton temperature, and He/O. This work provides quantitative evidence for the charge states of C and O being the most effective fixed criteria. Considering the obvious overlaps in key parameters between SW, ICMEs, and PCs, multi-parameter machine learning classifiers show an advantage over fixed criteria in separating them.
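A minimal sketch of the classification step follows, using scikit-learn; the feature matrix, labels and parameter names are placeholders standing in for the 12 ACE parameters and the SW/ICME/PC classes.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["n_p", "v_p", "T_p", "Bz", "He/H", "He/O", "Fe/O",
            "<Q_O>", "<Q_C>", "C6+/C5+", "C6+/C4+", "O7+/O6+"]

rng = np.random.default_rng(0)
X = rng.random((600, 12))        # placeholder: one row per 1 au measurement
y = rng.integers(0, 3, 600)      # placeholder labels: 0 = SW, 1 = ICME, 2 = PC

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

# Returned weights: which parameters separate the classes best.
for name, w in sorted(zip(FEATURES, clf.feature_importances_),
                      key=lambda t: -t[1]):
    print(f"{name:10s} {w:.3f}")

# The trained classifier can then categorize origin-unknown cold materials.
unknown = rng.random((5, 12))
print(clf.predict(unknown))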

How to cite: Meng, S., Yao, S., and Cheng, Z.: Key Signatures of Prominence Materials and Category of Unknown-origin Cold Materials identified by Machine Learning Classifier, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12354, https://doi.org/10.5194/egusphere-egu23-12354, 2023.

EGU23-14927 | ECS | Orals | ESSI1.3

Automatically Calculating Depths of Martian and Lunar Pits with Satellite Imagery 

Daniel Le Corre, Nigel Mason, Jeronimo Bernard-Salas, Nick Cox, and David Mary

Pits, or pit craters, are roughly circular depressions in planetary surfaces which are generally formed through gravitational collapse. Pits will be primary targets for future space exploration and habitation because they are present on most rocky Solar System surfaces and are potential entrances to sub-surface cavities. This is particularly true on the Moon and Mars, where future astronauts will also be exposed to high radiation dosages whilst on the surface. However, since pits rarely have corresponding high-resolution elevation data, tools are required for approximating their depths in order to find the ideal candidates for exploration and habitation.

We develop a tool that automatically calculates a pit's apparent depth (the depth at the edge of its shadow) by measuring the shadow's width as it appears in satellite imagery. The tool can produce a profile of the apparent depth along the entire length of the shadow, using just one cropped single- or multi-band image of a pit. This allows the search for possible cave entrances to continue where altimetry or stereo image data are not available. Shadows are automatically extracted using k-means clustering with silhouette analysis for automatic cluster validation. We will present the results of testing the shadow extraction upon shadow-labelled Mars Reconnaissance Orbiter HiRISE imagery of Martian pits, as well as the findings of applying the tool to HiRISE images of Atypical Pit Craters (APCs) from the Mars Global Cave Candidate Catalog (MGC3) [1]. We will also present preliminary results of applying our tool to Lunar Reconnaissance Orbiter Narrow Angle Camera data of Lunar pits catalogued in the Lunar Pit Atlas [2].
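The shadow-extraction step can be sketched as below with scikit-learn; the image, cluster range and geometry are placeholder assumptions. The apparent depth then follows from the shadow width w and the solar elevation angle via h = w * tan(elevation).

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

img = np.random.rand(64, 64)            # placeholder single-band pit image
pixels = img.reshape(-1, 1)

best = (None, -1.0, None, None)         # (k, score, labels, centers)
for k in range(2, 6):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    s = silhouette_score(pixels, km.labels_, sample_size=2000, random_state=0)
    if s > best[1]:
        best = (k, s, km.labels_, km.cluster_centers_)

k, score, labels, centers = best
# The darkest cluster is taken as the shadow.
shadow = (labels == np.argmin(centers[:, 0])).reshape(img.shape)

# Apparent depth from shadow width (metres) and solar elevation angle.
w, elevation = 40.0, np.deg2rad(35.0)   # placeholder geometry
depth = w * np.tan(elevation)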

[1] – Cushing et al. (2015). Atypical pit craters on Mars: New insights from THEMIS, CTX, and HiRISE observations, Journal of Geophysical Research: Planets, 120, 1023–1043

[2] – Wagner & Robinson (2021). Occurrence and Origin of Lunar Pits: Observations from a New Catalog, in 52nd Lunar and Planetary Science Conference, Lunar and Planetary Science Conference, p. 2530

How to cite: Le Corre, D., Mason, N., Bernard-Salas, J., Cox, N., and Mary, D.: Automatically Calculating Depths of Martian and Lunar Pits with Satellite Imagery, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14927, https://doi.org/10.5194/egusphere-egu23-14927, 2023.

EGU23-15160 | ECS | Orals | ESSI1.3

Detecting the magnetopause of Mercury by neural network — using MESSENGER data to train for BepiColombo. 

Lukas Maes, Markus Fraenz, and Daniel Heyner

The BepiColombo mission will arrive at Mercury in 2025. It consists of two spacecraft, both carrying a magnetometer. One of the science objectives of these instruments is to study the structure of Mercury’s magnetosphere and its dynamical interaction with the solar wind. To study this statistically, a large dataset of observations of the magnetopause (the magnetosphere’s outer boundary) is needed. However, identifying such magnetopause crossings in magnetic field data requires visual inspection by human experts and as such is a very time-consuming process. We therefore design an algorithm to automatically detect the Hermean magnetopause in magnetometer time series data, making use of a convolutional neural network.

Since no in-orbit BepiColombo data are available yet, we train the network on MESSENGER magnetometer data. However, we formulate the problem and design the architecture of the network in such a way that the algorithm should be easily transferable to BepiColombo magnetometer data, avoiding the possible impact of any instrumental particularities or orbital biases.

The goal is to have a neural network which is directly applicable to BepiColombo magnetometer data, as soon as the observations start and without any further training, thereby eliminating the necessity of manually creating a new dataset of BepiColombo magnetopause crossings.
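Such a detector could take the form of the PyTorch sketch below: a 1-D CNN scoring fixed-length windows of the three field components for the presence of a crossing. The window length, layers and per-window normalization (intended to reduce instrument-specific offsets) are illustrative assumptions.

import torch
import torch.nn as nn

# Window classifier: input (batch, 3 field components, 256 time steps).
detector = nn.Sequential(
    nn.Conv1d(3, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, 1),                   # logit: crossing inside window or not
)

windows = torch.randn(8, 3, 256)        # placeholder MESSENGER B-field windows
# Per-window normalization (an assumption) reduces instrument-specific offsets.
windows = (windows - windows.mean(dim=2, keepdim=True)) / \
          windows.std(dim=2, keepdim=True)
p_crossing = torch.sigmoid(detector(windows))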

How to cite: Maes, L., Fraenz, M., and Heyner, D.: Detecting the magnetopause of Mercury by neural network — using MESSENGER data to train for BepiColombo., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15160, https://doi.org/10.5194/egusphere-egu23-15160, 2023.

EGU23-16941 | ECS | Orals | ESSI1.3

Mars Perseverance Panoramic Image for Self-Determination Mission Algorithm 

Okta Bramantio Swida, Bernard Foing, and Constantijn Vleugels

Aiming to unravel the astrobiology of Mars, the Perseverance mission came with a lot of unknowns. With our current surface-level knowledge, the High Resolution Imaging Science Experiment (HiRISE) can already determine observation or experiment sites from the images generated by the orbiter. However, although its resolution is high enough to resolve objects around 1 meter in size, we can always expect much more from ground-level observation.

The Mars Perseverance rover carries a pair of Mastcam-Z cameras arranged to emulate the human eyes for depth determination in image processing. The instruments can produce stereo colour images at ground level, which can be used to make detailed, high-precision maps of the Martian surface scenery.

Building and analyzing these images manually can take days on Earth, but utilising machine learning tools and on-site computation could save the mission a lot of time. The current model used on Mars Perseverance is AutoNav Mark 4, which handles many tasks, including spacecraft positioning, in-flight orbit determination, target tracking, and ephemeris calculations, all of which can be computationally expensive. The aim of this research is therefore to develop a simple algorithm for object and slope determination that feeds into an autonomous path determination process. The data fed into the algorithm are panoramic images captured by the Mastcam-Z mounted on Mars Perseverance.

How to cite: Swida, O. B., Foing, B., and Vleugels, C.: Mars Perseverance Panoramic Image for Self-Determination Mission Algorithm, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16941, https://doi.org/10.5194/egusphere-egu23-16941, 2023.

EGU23-2909 | Posters on site | ESSI1.5

Long-Term Forecasting of Environment variables of MERRA2 based on Transformers 

Tsengdar Lee, Sujit Roy, Ankur Kumar, Rahul Ramachandran, and Udaysankar Nair

Transformers in general have shown great promise in sequence modeling. The recently proposed vision transformer (ViT) by Dosovitskiy et al. has shown strong performance in image recognition [1]. Transformers with Fourier neural operator based token mixing, built on a ViT backbone, were proposed by Guibas et al. and have been used to predict wind and precipitation on the ERA5 dataset [2,3]. Following this work, we trained FourCastNet from scratch on the MERRA2 dataset with 3 vertical levels (z450, z500, z550) and 11 variables (adding u, v, and temperature). We trained on data from 2005 to 2015 and made predictions by providing initial conditions from 2017. Predictions were made up to 7 days ahead. For the first 24 hours of model prediction, the mean correlation was 0.998. The root mean squared error (RMSE) was 8.779 for the 6-hour prediction and 19.581 for 24 hours, on data ranging from -575.6 to 330.6. The model was further tested on 11 variables on the same training data to evaluate the prediction of major events such as hurricanes. Initial conditions for a category 5 hurricane (Sep 28 to Oct 10, 2016) were given to the model, which was able to predict the hurricane for 18 hours. Further work will tune the model and add more environmental variables from MERRA2 to make the prediction more robust over longer periods.
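For reference, the two reported skill measures can be computed as in the sketch below, assuming NumPy arrays of predicted and target fields; the array shapes are placeholders.

import numpy as np

def rmse(pred, target):
    # Root mean squared error between forecast and analysis fields.
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def correlation(pred, target):
    # Pearson correlation between the flattened fields.
    return float(np.corrcoef(pred.ravel(), target.ravel())[0, 1])

pred = np.random.rand(11, 91, 144)      # placeholder: variables x lat x lon
target = np.random.rand(11, 91, 144)
print(rmse(pred, target), correlation(pred, target))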

References:
1. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S. and Uszkoreit, J., 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
2. Guibas, J., Mardani, M., Li, Z., Tao, A., Anandkumar, A. and Catanzaro, B., 2021. Efficient Token Mixing for Transformers via Adaptive Fourier Neural Operators. In International Conference on Learning Representations.
3. Pathak, J., Subramanian, S., Harrington, P., Raja, S., Chattopadhyay, A., Mardani, M., Kurth, T., Hall, D., Li, Z., Azizzadenesheli, K. and Hassanzadeh, P., 2022. FourCastNet: A global data-driven high-resolution weather model using adaptive Fourier neural operators. arXiv preprint arXiv:2202.11214.

How to cite: Lee, T., Roy, S., Kumar, A., Ramachandran, R., and Nair, U.: Long-Term Forecasting of Environment variables of MERRA2 based on Transformers, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2909, https://doi.org/10.5194/egusphere-egu23-2909, 2023.

EGU23-2944 | Orals | ESSI1.5

Foundation AI Models for Science 

Manil Maskey, Rahul Ramachandran, Tsengdar Lee, and Raghu Ganti

Foundation Models (FMs) are AI models designed to replace task- or application-specific models. FMs can be applied to many different downstream applications. They are trained using self-supervised techniques and can be built on any type of sequence data; the use of self-supervised learning removes the hurdle of developing a large labeled dataset for training. Most FMs use the transformer architecture, which utilizes self-attention to allow the network to model the influence of distant data points on each other in both space and time. FMs exhibit emergent properties that are induced from the data.

FMs can be an important tool for science. The scale of these models results in better performance for different downstream applications, which show better accuracy than models built from scratch. FMs drastically reduce the cost of entry to build different downstream applications, both in time and effort. FMs for selected science datasets, such as optical satellite data, can accelerate applications ranging from data quality monitoring to feature detection and prediction. FMs can make it easier to infuse AI into scientific research by removing the training data bottleneck and increasing the use of science data.

How to cite: Maskey, M., Ramachandran, R., Lee, T., and Ganti, R.: Foundation AI Models for Science, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2944, https://doi.org/10.5194/egusphere-egu23-2944, 2023.

EGU23-5443 | Orals | ESSI1.5

Earth System Deep Learning towards a Global Digital Twin of Wildfires 

Ioannis Prapas, Ilektra Karasante, Akanksha Ahuja, Spyros Kondylatos, Eleanna Panagiotou, Charalampos Davalas, Lazaro Alonso, Rackhun Son, Michail Dimitrios, Nuno Carvalhais, and Ioannis Papoutsis

Due to climate change, we expect an exacerbation of fire in Europe and around the world, with major wildfire events extending to northern latitudes and boreal regions [1]. In this context, it is important to improve our capabilities to anticipate fire danger and understand its driving mechanisms at a global scale. As the Earth is an interconnected system, large-scale processes can have an effect on the global climate and fire seasons. For example, extreme fires in Siberia have been linked to previous-year surface moisture conditions and anomalies in the Arctic Oscillation [2]. As part of the ESA-funded project SeasFire (https://seasfire.hua.gr), we gather and harmonize data related to seasonal fire drivers and develop deep learning models that are able to capture spatiotemporal associations, with the goal of forecasting burned area sizes on a seasonal scale, globally. We publish a global analysis-ready datacube for seasonal fire forecasting for the years 2001-2021 at a spatiotemporal resolution of 0.25 deg x 0.25 deg x 8 days [3]. The datacube includes a combination of variables describing the seasonal fire drivers, namely climate, vegetation, oceanic indices, human factors, land cover and burned areas. We leverage the availability of big EO data and advances in deep learning [4, 5] to forecast global burned areas, capture the spatio-temporal interactions of the Earth system variables and identify potential teleconnections that determine wildfire regimes in the light of climate change. We present deep learning models that handle the Earth as a system, such as graph neural networks and transformer-based architectures. Applied to the prediction of wildfires at different temporal horizons, our deep learning models skillfully predict burned area patterns. Exploring the explanations of the models, we reveal important spatio-temporal links.

Our approach, using AI to model the earth as a system and capture long spatio-temporal interactions, showcases the potential of an application-specific digital twin. The SeasFire datacube can be exploited as a baseline digital twin for modeling different natural hazards, including floods, heatwaves, and droughts. Thus, we will discuss insights and future directions for digital twins in anticipating climate extremes, inspired by our global wildfire prediction paradigm. 

 

[1] Wu, Chao, et al. "Historical and future global burned area with changing climate and human demography." One Earth 4.4 (2021): 517-530.

[2] Kim, Jin-Soo, et al. "Extensive fires in southeastern Siberian permafrost linked to preceding Arctic Oscillation." Science advances 6.2 (2020): eaax3308.

[3] Alonso, Lazaro, et al. Seasfire Cube: A Global Dataset for Seasonal Fire Modeling in the Earth System. Zenodo, 15 July 2022, doi:10.5281/zenodo.6834584.

[4] Kondylatos, Spyros et al. “Wildfire Danger Prediction and Understanding with Deep Learning.” Geophysical Research Letters, 2022.  doi: 10.1029/2022GL099368

[5] Prapas, Ioannis et al. “Deep Learning for Global Wildfire Forecasting.” NeurIPS 2022 workshop on Tackling Climate Change with Machine Learning, doi:  10.48550/arXiv.2211.00534

How to cite: Prapas, I., Karasante, I., Ahuja, A., Kondylatos, S., Panagiotou, E., Davalas, C., Alonso, L., Son, R., Dimitrios, M., Carvalhais, N., and Papoutsis, I.: Earth System Deep Learning towards a Global Digital Twin of Wildfires, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5443, https://doi.org/10.5194/egusphere-egu23-5443, 2023.

EGU23-5746 | Orals | ESSI1.5

Confidence estimation of DNN predictions for on-board applications 

Nicolas Dublé, François De Vieilleville, Adrien Lagrange, and Bertrand Le Saux

Most DNNs are designed to predict a class, a segmentation map or detections, regardless of whether they are interpolating or extrapolating. A confidence score answers the need for interpretable outputs and can help an AI4EO end user make decisions.

The first investigated use case was binary classification of small Sentinel-2 tiles containing ships or not (with the two classes “tile containing ship” and “tile not containing ship”). The database gathered 16,947 small 140x140 tiles extracted from 37 Sentinel-2 products. The ground truth was generated using Danish AIS data and then checked by eye. It was divided into several datasets for training, validation, testing, and active learning.

The second investigated use case was the classification of 10 geophysical phenomena from Sentinel-1 wave mode [Wang et al., 2018]. The database gathered 30,032 images with a fairly balanced distribution across the 10 classes.

Classification networks (VGG16) were trained on the training datasets of both use cases, reaching high performance (>95% accuracy). We added several Out-Of-Distribution (OOD) examples for the ship classification use case and used the test database provided for the ocean features use case. The models reach around 70% accuracy on these two harder datasets, which contain many examples of wrong classifications, so regressing a confidence score is of clear interest.

The solution developed uses the ConfidNet approach of Corbière et al. Without retraining the classification DNN, we added a second DNN, composed of several dense layers, which takes the latent space of the classification network as input and estimates a confidence score by trying to approach the True Class Probability. It proved easy to train when enough failure examples are available in the database.
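A minimal sketch of such an auxiliary confidence network follows, in PyTorch; the latent dimension, layer widths and training data are assumptions. The regression target is the True Class Probability (TCP), i.e. the frozen classifier's softmax probability for the ground-truth class.

import torch
import torch.nn as nn

latent_dim = 512                       # assumed size of the classifier features
confidnet = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 1), nn.Sigmoid(),   # confidence score in [0, 1]
)

# One training step: the classifier stays frozen, only confidnet is updated.
feats = torch.randn(32, latent_dim)    # latent features from the frozen VGG16
tcp = torch.rand(32, 1)                # softmax probability of the true class
loss = nn.functional.mse_loss(confidnet(feats), tcp)
loss.backward()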

The main objective of the ConfidNet is to find the “ID”/“OOD” boundary, qualifying which examples the classifier should be able to predict (interpolation) and which it should fail to predict (extrapolation). Substantial work went into qualifying the quality of the ConfidNet's predictions (the confidence score), to ensure that it did not simply learn to map the subset of the dataset where the classifier fails and the one where the classifier is right. It presented interesting generalization properties and turned out to be less “dataset-dependent” than a classical DNN.

21 different network configurations were tested, with architecture sizes varying from 4k to 2.5M parameters. Many of these configurations reached similar results, and the number of layers was more decisive than the number of parameters in the intermediate feature maps.

The main results of this study are the relevance of the ConfidNet approach in AI4EO scenarios, the possibility of shrinking the network for on-board deployment, and a first indication that the ConfidNet can learn in a different way from classification networks, with interesting generalization properties. This study demonstrates that confidence scores can be associated with the predictions of a DNN in a satisfying way.

How to cite: Dublé, N., De Vieilleville, F., Lagrange, A., and Le Saux, B.: Confidence estimation of DNN predictions for on-board applications, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5746, https://doi.org/10.5194/egusphere-egu23-5746, 2023.

EGU23-6060 | ECS | Posters on site | ESSI1.5

A machine learning-powered Digital Twin for extreme weather events analysis 

Gabriele Accarino, Donatello Elia, Davide Donno, Francesco Immorlano, and Giovanni Aloisio

In recent years, climate change has been leading to an exacerbation of Extreme Weather Events (EWEs), such as storms and wildfires, raising major concerns about increases in their intensity, frequency and duration. Detecting and predicting EWEs is challenging due to the rare occurrence of these events and, consequently, the lack of related historical data. Additionally, gathering data while an event unfolds is not straightforward, due to the intrinsic difficulty of positioning and operating acquisition systems. Advances in Machine Learning (ML) can provide cutting-edge modeling techniques to deal with EWE detection and prediction tasks, offering the cost-effective and fast-computing solutions that policy makers strongly require for taking timely and informed actions in the presence of EWEs.

Solutions based on ML could, thus, support studies of such extreme events, providing scientists, policy makers and also the general public with powerful and innovative data-driven tools. However, from an infrastructural point of view, supporting such types of applications requires a wide set of integrated software components including data gathering and harmonisation pipelines, data pre-processing and augmentation modules, computing platforms for model training, results visualization tools, etc.

A Digital Twin for the analysis of extreme weather events, focusing on storms and wildfires, is being developed in the context of the EU-funded InterTwin project. The InterTwin project aims at defining a Digital Twin Engine for supporting scientific applications from different fields. In particular, for the EWEs, neural networks are being adopted as modeling tools capable of learning the underlying mapping between drivers and outcomes from past data and generalizing it to future projection data. This contribution will present the early concept behind the design of this machine learning-powered Digital Twin for EWE studies.

How to cite: Accarino, G., Elia, D., Donno, D., Immorlano, F., and Aloisio, G.: A machine learning-powered Digital Twin for extreme weather events analysis, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6060, https://doi.org/10.5194/egusphere-egu23-6060, 2023.

EGU23-8777 | ECS | Orals | ESSI1.5

Deep Learning for Verification of Earth's surfaces 

Margarita Choulga, Tom Kimpson, Matthew Chantry, Gianpaolo Balsamo, Souhail Boussetta, Peter Dueben, and Tim Palmer

Ever-increasing computing capabilities and the demand for high-resolution numerical weather prediction and climate information make the representation of Earth's surfaces especially interesting. Accurate and up-to-date knowledge of the surface state for ecosystems such as forests, agriculture, lakes and cities strongly influences skin temperature and the turbulent latent and sensible heat fluxes, providing the lower boundary conditions for energy and moisture availability near the surface. We developed a quick and automatic tool to assess the benefits of updating different surface fields, which makes use of a neural network regression model trained to simulate satellite-observed surface skin temperatures. This tool was deployed to determine the accuracy of several global datasets for lake, forest, and urban distributions, and comparison results will be shown. The neural network regression model has proven useful and easily adaptable for assessing unforeseen impacts of ancillary datasets, also detecting erroneous regional areas over the globe, and is thus a valuable support to model development.
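Schematically, such a surrogate can be as simple as the following PyTorch sketch; the number and identity of the input predictors are assumptions for illustration, not the actual configuration.

import torch
import torch.nn as nn

# Hypothetical surrogate: per-grid-cell surface predictors in (e.g. lake,
# forest and urban fractions plus meteorological drivers), satellite-observed
# skin temperature out. Eight inputs is an assumption, not the actual set.
surrogate = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

x = torch.rand(1024, 8)      # placeholder predictors for 1024 grid cells
t_skin = surrogate(x)        # predicted skin temperature
# Swapping one ancillary field (e.g. a candidate lake map) into x and comparing
# predictions against observations scores the benefit of that dataset.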

How to cite: Choulga, M., Kimpson, T., Chantry, M., Balsamo, G., Boussetta, S., Dueben, P., and Palmer, T.: Deep Learning for Verification of Earth's surfaces, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8777, https://doi.org/10.5194/egusphere-egu23-8777, 2023.

EGU23-10092 | Posters on site | ESSI1.5

An Information Management Framework for Environmental Digital Twins (IMFe)  as a concept and pilot 

Justin Buck, Andrew Kingdon, John Siddorn, Gordon Blair, Alexandra Kokkinaki, John Blower, Matt Fry, Ben Marchant, Sam Pepler, John Watkins, and James Byrne

Environmental science is concerned with assessing the impacts of changing environmental conditions upon the state of the natural world. Environmental Digital Twins (EDT) are a new technology that enables environmental change scenarios for real systems to be modelled and their impacts visualised. They will be particularly effective at delivering understanding of these impacts on the natural environment to non-specialist stakeholders.

The UK Natural Environment Research Council (NERC) recently published its first digital strategy, which sets out a vision for digitally enabled environmental science for the next decade. This strategy places data and digital technologies at the heart of UK environmental science.

EDT have been made possible by the emergence of increasingly large and diverse static data sources, networks of dynamic environmental data from sensors, and time-variant process modelling. Combined with visualisation capabilities, these provide the basis of the digital twin technologies that enable the environmental science community to make a step change in understanding of the environment. Components may be developed separately by a network but can be combined to improve understanding, provided development follows agreed standards to facilitate data exchange and integration.

Replicating the behaviours of environmental systems is inevitably a multi-disciplinary activity. To enable this, an Information Management Framework for Environmental Digital Twins (IMFe) is needed that establishes the components for effective information management within and across the EDT ecosystem. This must enable secure, resilient interoperability of data, and act as a reference point to facilitate data use in line with security, legal, commercial, privacy and other relevant concerns. We present recommendations for developing an IMFe, including the application of concepts such as an asset commons and a balanced approach to standards to facilitate minimum interoperability requirements between twins while iteratively implementing an IMFe. Achieving this requires components to be developed that follow agreed standards, to ensure that information can be trusted by the user, and that are semantically interoperable so data can be shared. A digital Asset Register will be defined to provide access to and enable linking of such components.

This previously conceptual project has now been advanced into the Pilot IMFe project, aiming to define the architectures, technologies, standards and hardware infrastructure needed to develop a fully functioned environmental digital twin. During the project lifespan this will be tested by constructing a pilot EDT for the Haig Fras Marine Conservation Zone (MCZ), which both enables testing of the proposed IMFe concepts and provides a clear demonstration of the power of EDT to monitor and scenario-test a complex environmental system for the benefit of stakeholders.

How to cite: Buck, J., Kingdon, A., Siddorn, J., Blair, G., Kokkinaki, A., Blower, J., Fry, M., Marchant, B., Pepler, S., Watkins, J., and Byrne, J.: An Information Management Framework for Environmental Digital Twins (IMFe)  as a concept and pilot, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10092, https://doi.org/10.5194/egusphere-egu23-10092, 2023.

EGU23-10138 | Orals | ESSI1.5 | Highlight

Open-Source Framework For Earth System Digital Twins Applied to Surface Water Hydrology 

Thomas Huang and the NASA AIST IDEAS and SCO FloodDAM Teams

An Earth System Digital Twin (ESDT) is a dynamic, interactive, digital replica of the state and temporal evolution of Earth systems. It integrates multiple models along with observational data, and connects them with analysis, AI, and visualization tools. Together, these enable users to explore the current state of the Earth system, predict future conditions, and run hypothetical scenarios to understand how the system would evolve under various assumptions. The NASA Advanced Information Systems Technology (AIST) program’s Integrated Digital Earth Analysis System (IDEAS) project partners with the Space for Climate Observatory (SCO) (https://www.spaceclimateobservatory.org/) FloodDAM Digital Twin effort led by CNES to establish an extensible open-source framework to develop digital twins of our physical environment for Earth Science with an initial focus on surface water hydrology in Earth’s rivers and lakes. The joint effort delivers an open-source system architecture with mechanisms for the outputs of one model to feed into others, for driving models with observation data, and for harmonizing observation data and model outputs for analysis. Water resource science is multidisciplinary in nature, and it not only assesses the impact from our changing climate using measurements and modeling, but it also offers opportunities for science-guided, data-driven decision support. The joint effort uses flood prediction and analysis as its primary use case. The work presents a multi-agency joint effort to define and develop a federated Earth System Digital Twin solution between NASA and CNES that powers advanced immersive science and custom user applications for scenario-based analysis.

How to cite: Huang, T. and the NASA AIST IDEAS and SCO FloodDAM Teams: Open-Source Framework For Earth System Digital Twins Applied to Surface Water Hydrology, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10138, https://doi.org/10.5194/egusphere-egu23-10138, 2023.

EGU23-10488 | ECS | Orals | ESSI1.5

Statistical downscaling of precipitation with deep neural networks 

Bing Gong, Yan Ji, Michael Langguth, and Martin Schultz

Accurate weather predictions are essential for many aspects of society. Providing a reliable high-resolution precipitation field is essential to capture the finer scales of heavy precipitation events, which are normally poorly represented in numerical models. Statistical downscaling is an appealing tool since it is computationally inexpensive; thus, it has been widely used over the last three decades. In recent years, super-resolution with deep learning has been successfully applied to generate high-resolution from low-resolution images in the computer vision domain. This task is somewhat analogous to downscaling in the meteorological domain.

Inspired by this, we explore the use of deep neural networks with a super-resolution approach for statistical precipitation downscaling. We apply the Swin transformer architecture (SwinIR) as well as a convolutional neural network (U-Net) with a Generative Adversarial Network (GAN) and a diffusion component for probabilistic downscaling. We use short-range forecasts from the Integrated Forecast System (IFS) on a regular spherical grid with ΔxIFS=0.1° and map them to the high-resolution radar observation data RADKLIM (ΔxRK=0.01°). The neural networks are fed with nine static and dynamic predictors, similar to the study by Harris et al., 2022. All models are comprehensively evaluated by grid-point-level errors as well as error metrics for spatial variability and the generated probability distribution. Our results demonstrate that the Swin transformer model can improve accuracy at lower computational cost compared to the U-Net architecture. The GAN and diffusion models both further help the model capture the strong spatial variability of the observed data. Our results encourage further development of DNNs that can potentially be leveraged to downscale other challenging Earth system data, such as cloud cover or wind.
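A minimal super-resolution building block of the kind these architectures rely on is sketched below in PyTorch: a convolutional upsampler mapping the nine 0.1° predictor channels to a 0.01° precipitation field in a single 10x step. Channel counts and the non-negativity activation are illustrative assumptions.

import torch
import torch.nn as nn

# Nine predictor channels at 0.1 deg in, one precipitation field at 0.01 deg out.
upscaler = nn.Sequential(
    nn.Conv2d(9, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 100, 3, padding=1),
    nn.PixelShuffle(10),    # (N, 100, H, W) -> (N, 1, 10H, 10W)
    nn.Softplus(),          # keeps the precipitation output non-negative
)

coarse = torch.rand(2, 9, 16, 16)    # placeholder IFS predictor patches
fine = upscaler(coarse)              # shape (2, 1, 160, 160)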

How to cite: Gong, B., Ji, Y., Langguth, M., and Schultz, M.: Statistical downscaling of precipitation with deep neural networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10488, https://doi.org/10.5194/egusphere-egu23-10488, 2023.

EGU23-11489 | Posters on site | ESSI1.5

Towards a benchmark dataset for statistical downscaling of meteorological fields 

Michael Langguth, Bing Gong, Yan Ji, Martin G. Schultz, and Olaf Stein

The representation of the atmospheric state at high spatial resolution is of particular relevance in various domains of Earth science. While global reanalysis datasets such as ERA5 provide comprehensive repositories of meteorological data, their spatial resolution (∆x≥25 km) is too coarse to capture relevant local features, mainly over complex terrain (e.g. cold pools in valleys, low-level jets, local heavy precipitation events).
Recently, various studies have started to apply deep neural networks adapted from computer vision to increase the spatial resolution of meteorological fields. Although these studies reveal great potential in the domain of statistical downscaling, intercomparison of the approaches is impeded by the large variety of methods and datasets deployed. Comparisons to classical downscaling methods, developed over decades in the meteorological community, are also often underrepresented.

Inspired by the available benchmark datasets for various computer vision tasks and for weather forecasting (e.g. WeatherBench and WeatherBench Probability), our study aims to provide a benchmark dataset for statistical downscaling of meteorological fields. We choose the coarse-grained ERA5 reanalysis (∆xERA5≃30 km) and the fine-scaled COSMO-REA6 (∆xCREA6≃6km) as input and target datasets. Both datasets enable the formulation of a real downscaling task: super-resolve the data and correct for model biases.
The benchmark dataset provides a collection of predictors and predictands for a couple of standard downscaling tasks. These comprise downscaling of the 2m temperature, the surface irradiance, the near-surface wind field and precipitation. Along with the dataset, benchmark deep neural networks, namely variants of U-Nets and GANs, will be provided. Well-chosen sets of evaluation metrics including baseline scores of the benchmarked deep neural networks are presented to enable comparison between different methods.
The envisioned benchmark dataset will provide a comprehensive basis for comparing neural network approaches on statistical downscaling of meteorological fields. This, in turn, is considered to enhance confidence and transparency in the application of deep learning methods on Earth system problems.

How to cite: Langguth, M., Gong, B., Ji, Y., Schultz, M. G., and Stein, O.: Towards a benchmark dataset for statistical downscaling of meteorological fields, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11489, https://doi.org/10.5194/egusphere-egu23-11489, 2023.

EGU23-12333 | ECS | Posters virtual | ESSI1.5

An AI hybrid predictive tool for extreme hurricane forecasting 

Javier Martinez Amaya, Cristina Radin, Veronica Nieves, Nicolas Longépé, and Jordi Muñoz-Marí

Hurricanes, and more generally tropical cyclones, are among the most destructive natural hazards, and are arguably changing under the influence of climate change. Applying the power of AI to predict the extreme behavior of these events could be key to helping minimize hurricane damage. AI tools are a significant opportunity to: 1) identify non-linear relationships between changing hurricane-related characteristics and tropical storm intensification, and 2) anticipate responses to these changes. Another key part of this AI-based system is uncertainty quantification for decision-making processes. In this context, we present an improved hybrid ML model for predicting the development of extreme hurricane events, which includes effective information on the spatio-temporal evolution of structural parameters extracted from IR satellite images. This approach, which combines Convolutional Neural Networks (CNNs) and a Random Forest (RF) classification framework, has been trained and tested with data from 1995 onwards over the North Atlantic and Northeast Pacific regions. Results from the CNN-RF model show a performance of 80% or better for lead times of up to three days ahead (every 6 hours). With the proposed configuration, the overall precision has increased by at least 8%. This model could be further improved by including new variables linked to environmental factors, which will be explored progressively.
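The hybrid structure can be sketched as below (PyTorch for the CNN features, scikit-learn for the RF); the architecture, image size and labels are placeholder assumptions, not the trained model.

import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

# A (frozen) CNN turns IR images into feature vectors; an RF classifies them.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

ir_images = torch.rand(200, 1, 64, 64)    # placeholder IR cyclone snapshots
with torch.no_grad():
    feats = cnn(ir_images).numpy()

labels = np.random.randint(0, 2, 200)     # placeholder: 1 = extreme development
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(feats, labels)
proba = rf.predict_proba(feats)[:, 1]     # class probability per sample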

How to cite: Martinez Amaya, J., Radin, C., Nieves, V., Longépé, N., and Muñoz-Marí, J.: An AI hybrid predictive tool for extreme hurricane forecasting, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12333, https://doi.org/10.5194/egusphere-egu23-12333, 2023.

EGU23-13106 | Posters virtual | ESSI1.5

An Unsupervised Anomaly Detection Problem in Urban InSAR-PSP Long Time-series 

Ridvan Kuzu, Yi Wang, Octavian Dumitru, Leonardo Bagaglini, Giorgio Pasquali, Filippo Santarelli, Francesco Trillo, Sudipan Saha, and Xiao Xiang Zhu

Interferometric Synthetic Aperture Radar (InSAR) satellite measurements are an effective tool for monitoring ground motion with millimetric resolution over long periods of time. The Persistent Scatterer Pair (PSP) method, developed in [1], is particularly useful for detecting differential displacements of buildings at multiple positions with few assumptions about the background environment. As a result, anomalous behaviours in building motion can be detected through PSP time series, which are commonly used to perform risk assessments in hazardous areas and diagnostic analyses after damage or collapse events. However, current autonomous early warning systems based on PSP-InSAR data are limited to detecting changes in linear trends and rely on sinusoidal and polynomial models [2]. This can be problematic if background signals exhibit more complex behaviours, as anomalous displacements may be difficult to identify. To address this issue, we propose an unsupervised anomaly detection method using Artificial Intelligence algorithms to identify potentially anomalous building motions based on PSP long time series data.

To identify anomalous building motions, we applied two different AI algorithms: one based on a Long Short-Term Memory (LSTM) Autoencoder inspired by [3], and a Graph Neural Network version of it. The LSTM Autoencoder is an unsupervised representation learning framework that captures data representations by reconstructing the correct order of shuffled time series. Its encoder extracts feature representations of a time series, while its decoder reconstructs the time series. Assuming that most stable samples exhibit similar temporal changes, this algorithm can be used for anomaly detection, as the reconstruction loss will be high for anomalous time series.

The data used in this study were provided by the European Ground Motion Service over a rectangular area surrounding the city of Rome and include approximately 500,000 time series aggregated over more than 80,000 buildings. The time period covered is from 2015 to 2020.

In our proposed approach, we first extract deep feature representations for each timestamp of a non-anomalous time series. The feature sequence is then shuffled and passed through an LSTM encoder-decoder network. By learning to reconstruct the feature sequence with the correct order, the network is able to recognize high-level representations of the time series. In the second step, the pre-trained network is used to reconstruct another time series. If the time series is non-anomalous, the correct order can be reconstructed with high confidence; otherwise, it is difficult to reconstruct the correct order. By selecting an appropriate threshold, anomalies can be detected with high reconstruction losses.
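A simplified sketch of this shuffle-and-reconstruct autoencoder follows, in PyTorch; the series length, feature and hidden sizes are assumptions, and the real pipeline operates on learned per-timestamp features rather than raw displacements.

import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    # Encode a (shuffled) displacement series, decode it in original order.
    def __init__(self, n_features=1, hidden=64):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x_shuffled, seq_len):
        _, (h, _) = self.encoder(x_shuffled)           # summary of the series
        # Repeat the summary at every step and decode the ordered series.
        z = h[-1].unsqueeze(1).repeat(1, seq_len, 1)
        dec, _ = self.decoder(z)
        return self.out(dec)

model = LSTMAutoencoder()
x = torch.randn(16, 120, 1)                   # placeholder PSP series batch
perm = torch.randperm(x.size(1))
recon = model(x[:, perm, :], x.size(1))
# High reconstruction error w.r.t. the ordered series flags an anomaly.
score = ((recon - x) ** 2).mean(dim=(1, 2))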

Overall, our proposed AI-based approach shows promising results for identifying anomalous building motions in PSP long time-series data. The use of unsupervised learning allows for more accurate statistical representations of the data and more reliable detection of anomalous behaviours. This approach has the potential to improve autonomous early warning systems for risk assessments and diagnostic analyses in dangerous areas.

This work is part of the RepreSent project funded by the European Space Agency (NO:4000137253/22/I-DT).

[1] https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4779025
[2] https://www.mdpi.com/2072-4292/10/11/1816/pdf
[3] https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9307226

How to cite: Kuzu, R., Wang, Y., Dumitru, O., Bagaglini, L., Pasquali, G., Santarelli, F., Trillo, F., Saha, S., and Zhu, X. X.: An Unsupervised Anomaly Detection Problem in Urban InSAR-PSP Long Time-series, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13106, https://doi.org/10.5194/egusphere-egu23-13106, 2023.

EGU23-13921 | Orals | ESSI1.5 | Highlight

Towards a local, dated and thematic digital twins factory 

Jean-Marc Delvit, Pierre-Marie Brunet, Pierre Lassalle, Dimitri Lallement, and Simon Baillarin

The notion of a digital twin can be ambiguous because it is defined in various ways, and recent months have seen the emergence of many global digital twin initiatives. The challenge of these global digital twins is to create a qualified digital replica of our planet, making it possible to monitor, simulate and anticipate natural phenomena and human activities. The target users are either scientists or decision makers. Through the digital twin, they have access to a digital representation of an environment using all available spatial and non-spatial data, accompanied by a set of physical and statistical models to calculate projections, replay past events or simulate future ones.

Refining and evaluating the accuracy of these projections is a major challenge for digital twins. In addition to knowledge of the physical modelling, suitable data must also be available. Complementary to the global approach, the notion of local and dated digital twins therefore appears essential. Considering a digital representation of a restricted geographical area of interest (an urban area, watershed, coastline, etc.) gives access to very high-resolution "fresh" data in 2D and 3D, in-situ data and fine-mesh physical models. This user-centered and naturally thematic approach responds more finely and more pragmatically to the objectives presented. These local, dated and thematic digital twins are by essence ephemeral: a way to meet a specific need.

The challenge is therefore to set up a Digital Twin Factory (DTF). This DTF relies on a data lake and high computing capacity via clouds and/or HPC, and provides thematic algorithms and methodologies able to generate registered and coherent layers of information that enrich a datacube from which physical indicators can be computed spatially. Thanks to its thematic, local and on-demand characteristics, the DTF can mitigate the need for a universal metadata model. The datacube allows local physical and artificial intelligence models to be applied. The overarching architecture of the DTF will be presented, along with specific examples on coastal, urban and risk topics. These digital twins rely on a large amount of expertise in both data and modelling, involving various French (CNES, IGN, SHOM, IRD, CEA, INRAE, METEOFRANCE, CERFACS, BRGM, etc.) and international organizations (ESA, NASA, NOAA, ...).

For coastal areas, the goal is to describe the bathymetry-topography continuum well, taking into account the intertidal zones and specialized dynamic models together with 3D coastal land cover characterisation. For urban areas, the ambition is first to automatically produce a qualified 3D map together with its additional layers of information: 3D objects and related semantics (land cover and land use), including temporal dynamics and thermal information. For issues related to the management of natural risks (such as floods or fires), similar data layers can be used. Finally, new hypotheses can be injected into these digital replicas and multiple scenarios can be applied to assess the causal relationship between hypothesis and prediction. Very promising results will also be presented.

How to cite: Delvit, J.-M., Brunet, P.-M., Lassalle, P., Lallement, D., and Baillarin, S.: Towards a local, dated and thematic digital twins factory, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13921, https://doi.org/10.5194/egusphere-egu23-13921, 2023.

EGU23-14998 | Posters on site | ESSI1.5

Digital twincubator eWaterCycle 

Niels Drost, Peter Kalverla, Bart Schilperoort, Barbara Vreede, Sarah Alidoost, Stefan Verhoeven, Yang Liu, and Rolf Hut

Recently there’s been a lot of enthusiasm for the concepts of digital twins, virtual research environments, serious games, and other inspiring ideas to improve “the way we do science.” With eWaterCycle, we are no stranger to the cause. We’ve worked hard to build a platform that makes the scientific process (specifically, hydrological modelling) more accessible and engaging.

eWaterCycle gives users access to a centralized platform where they can perform hydrological experiments: simulating how water flows through a catchment area of choice, complete with data, a suite of models, an interactive scripting environment, and a graphical explorer to quickly set up an experiment. It shares many characteristics with what is commonly understood as a digital twin. But is it, really?

Sadly, the concept of digital twins suffers from linguistic inflation. At a recent event on the topic, the main coffee chatter was along the lines of “but what actually is it?” In an arena filled with resonating buzz, a clear image can help to regain focus and a common frame of reference. Is eWaterCycle, as a platform that supports working with each other’s models and data, a digital twin? Or is it more an incubator of digital twins? Either way, eWaterCycle can help make things concrete and specific, because it already exists.  

With several new projects promising to build digital twins of all sorts, we hope our experience can feed into the discussions on and development of new digital twins. Therefore, at EGU, we would like to reflect on the essence of our platform and our experience in building it. What is it (not)? What’s in it for you? What challenges did we face? And what does that mean for open science and collaborative research? 

How to cite: Drost, N., Kalverla, P., Schilperoort, B., Vreede, B., Alidoost, S., Verhoeven, S., Liu, Y., and Hut, R.: Digital twincubator eWaterCycle, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14998, https://doi.org/10.5194/egusphere-egu23-14998, 2023.

EGU23-15688 | Posters virtual | ESSI1.5

Digital Twins of the Ocean – Opportunities to Inform Sustainable Ocean Governance 

Joana Kollert, Martin Visbeck, and Ute Brönner

Recent advances in High Performance Computing and Earth System Model resolution have enabled the Earth Science community to envision Digital Twins as an innovative approach to global environmental problems. This is also true of the Ocean Science community.

A Digital Twin of the Ocean (DTO) merges marine system models with observational data and machine learning analytics to produce a digital replica of the real ocean. In addition to natural phenomena, DTOs can include socio-economic factors (e.g. ocean-use, pollution). Thus, DTOs can be used to monitor the current ocean state, but also to simulate future ‘what-if’ scenarios for varying human interventions. Another benefit of DTOs is that they can be used by a variety of stakeholders: by scientists to understand the ocean, by policymakers to make well-informed decisions, and by citizens to improve ocean literacy. As such, DTOs are a powerful tool in future-proofing sustainable development. Moreover, they provide strong motivation to improve the marine data landscape and build an interoperable system with agreed upon formats and standards. DTOs are tailored to a specific ocean area or purpose, such that a DTO framework is needed to implement data connectivity and interoperability, ease of access, standards and to highlight gaps. The UN Ocean Decade Program DITTO aims to provide such a framework.  Specifically, DITTO advances worldwide collaboration between scientists, data and IT experts to develop a common understanding of DTOs, to establish best practices in their development, and to advance a digital framework for DTOs to empower ocean professionals from all sectors around the world to effectively create their own digital twins.

DTOs offer the technology for building a social-ecologically integrated ocean ecosystem with observation and modelling networks that support sustainable ocean governance.

How to cite: Kollert, J., Visbeck, M., and Brönner, U.: Digital Twins of the Ocean – Opportunities to Inform Sustainable Ocean Governance, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15688, https://doi.org/10.5194/egusphere-egu23-15688, 2023.

Reservoir simulations often require statistical predictions to quantify production uncertainty or assess potential risks. Most existing uncertainty quantification procedures aim to decompose the input random field into independent random variables, provided the correlation scale is small compared to the domain size. In this work, we develop a K-means-based aggregation model for efficiently estimating multiphase flow performance across multiple geological realizations. The approach performs a number of single-phase flow simulations and uses K-means clustering to select only a few representatives, on which multiphase flow simulations are performed. An empirical model is then employed to describe the relationship between the single-phase and multiphase solutions using these representatives. Finally, the multiphase solution in all realizations can be easily predicted using the empirical model. The method is applicable to both 2D and 3D synthetic models and has been shown to perform well in reproducing the trusted interval of productivity and the probability distribution, as indicated by the cumulative distribution function. It captures the ensemble statistics of a large number of Monte Carlo realizations at significantly reduced computational cost.
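
A minimal Python sketch of this aggregation idea is given below, using synthetic lognormal single-phase responses and a hypothetical power-law single-to-multiphase relation in place of actual reservoir simulators; it shows the three steps of clustering, simulating only the representatives, and regressing to predict all realizations.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical stand-in: one cheap single-phase summary (e.g. a flow-capacity
# proxy) per geological realization, for 1000 realizations
single_phase = rng.lognormal(mean=0.0, sigma=0.5, size=(1000, 1))

# Cluster the single-phase responses and keep one representative per cluster
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(single_phase)
reps = [int(np.argmin(np.linalg.norm(single_phase - c, axis=1)))
        for c in km.cluster_centers_]

# Expensive multiphase simulations would run only on `reps`; a hypothetical
# power-law relation stands in for those results here
multi_reps = 2.0 * single_phase[reps, 0] ** 0.8

# Empirical model (linear in log space) maps single-phase -> multiphase
coef = np.polyfit(np.log(single_phase[reps, 0]), np.log(multi_reps), 1)
multi_all = np.exp(np.polyval(coef, np.log(single_phase[:, 0])))
print(multi_all.mean(), multi_all.std())
```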

How to cite: Liao, Q.: Clustering aggregation model for statistical forecasting of multiphase flow problems, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-973, https://doi.org/10.5194/egusphere-egu23-973, 2023.

EGU23-3891 | Posters on site | HS3.5

Estimating groundwater response time in humid climate by using spectral analysis 

Mariaines Di Dato, Timo Houben, and Sabine Attinger

During dry periods, river flow comprises baseflow, which typically generates from shallow aquifers. Understanding how such aquifers respond to climate events is key to managing environmental issues related to water supply or water quality. A typical indicator of groundwater response to climate events is the characteristic response time, which indicates the rate of depletion of shallow aquifers.

The traditional method to infer the characteristic response time analyzes the slope of the hydrograph recession curve. Such a method does not account for stormwater contribution in recession analysis, thereby assuming that the catchment is dry and the only contribution to discharge originates from groundwater. As a consequence, the recession analysis might underestimate the groundwater response time, owing to the presence of faster discharge components, i.e. surface runoff or interflow, in the falling limbs.

In this work, we propose an alternative methodology to calculate the characteristic response time, determined by analyzing the behavior of the baseflow time series in the frequency domain. The aquifer can be conceptualized as a low-pass filter that smooths the rapidly fluctuating components of the recharge signal. This behavior causes a cut-off frequency in the baseflow spectrum, which corresponds to the aquifer's characteristic time. We applied this approach to several gauging stations in Germany, whose humid climate is ideal for comparing the results with the classical recession analysis.
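
As a toy illustration of this frequency-domain approach (under the stand-in assumption of a linear reservoir driven by white-noise recharge, not the authors' catchment data), the following Python sketch computes the baseflow periodogram and fits a low-pass (Lorentzian) spectrum whose cut-off yields the characteristic time tc.

```python
import numpy as np
from scipy.signal import periodogram
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Hypothetical daily baseflow: white-noise recharge routed through a linear
# reservoir with characteristic time tc (days)
tc_true, n = 60.0, 20 * 365
q = np.zeros(n)
for t in range(1, n):
    q[t] = q[t - 1] * (1.0 - 1.0 / tc_true) + rng.standard_normal()

f, pxx = periodogram(q, fs=1.0)          # fs = 1 sample/day
f, pxx = f[1:], pxx[1:]                  # drop the zero frequency

def lorentzian(f, a, tc):
    # Low-pass spectrum with cut-off frequency 1 / (2*pi*tc)
    return a / (1.0 + (2.0 * np.pi * f * tc) ** 2)

(a_hat, tc_hat), _ = curve_fit(lorentzian, f, pxx,
                               p0=(pxx[0], 30.0), bounds=(0, np.inf))
print(f"estimated tc ~ {tc_hat:.1f} days (true {tc_true})")
```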

We observed that spectral analysis yields characteristic response times systematically larger than those calculated with recession analysis; on average there is a factor of two between the estimates provided by the two methods. Overall, our study emphasizes the need for careful consideration when estimating groundwater response times, especially in humid and sub-humid river basins.

How to cite: Di Dato, M., Houben, T., and Attinger, S.: Estimating groundwater response time in humid climate by using spectral analysis, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3891, https://doi.org/10.5194/egusphere-egu23-3891, 2023.

EGU23-3933 | Posters on site | HS3.5

Towards identification of dominant hydrological mechanisms in ungauged catchments 

Cristina Prieto, Le Vine Nataliya, Kavetski Dmitri, Fenicia Fabrizio, Scheidegger Andreas, and Vitolo Claudia

Modelling hydrological processes in ungauged catchments is a major challenge in environmental sciences and engineering. An ungauged catchment is a catchment that lacks streamflow data suitable for traditional modelling methods. Predicting streamflow in ungauged catchments requires some form of extrapolation ("regionalisation") from other "similar" catchments, with the variables of interest being flow "indices" or "signatures", such as quantiles of the flow duration curve.

Another major question in hydrology is the estimation of model structure that reflects the hydrological processes relevant to the catchment of interest. This question is intimately tied to process representation. To paraphrase a common saying, all models are wrong, but some model mechanisms (process representations) might be useful. Our previous study contributed a Bayesian framework for the identification of individual model mechanisms from streamflow data.

In this study we extend the mechanism identification method to operate in ungauged basins based on regionalized flow indices. Candidate mechanisms and model structures are generated, and the "dominant" (more a posteriori probable) model mechanisms are then identified using statistical hypothesis testing. As part of the derivation, it is assumed that the error in the regionalization of flow indices dominates the structural error of the hydrological model.

The proposed method is illustrated with real data and synthetic experiments based on 92 catchments from northern Spain, of which 16 are treated as ungauged. We use 624 model structures from the flexible hydrological model framework FUSE. Flow indices are regionalised using random forest regression in principal component (PC) space; we select the 4 leading indices in PC space. The case study setup includes an experiment using real data (where the true mechanisms are unknown) and a set of synthetic experiments with different error levels (where the "true" mechanisms are known).
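
The regionalization step described above can be sketched as follows; all data here are synthetic stand-ins, with only the dimensions (92 catchments, 16 ungauged, 4 leading principal components) taken from the abstract.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical stand-ins: catchment attributes and flow signatures
n_catch, n_attr, n_sig = 92, 8, 12
attributes = rng.standard_normal((n_catch, n_attr))
signatures = (attributes[:, :3] @ rng.standard_normal((3, n_sig))
              + 0.2 * rng.standard_normal((n_catch, n_sig)))

gauged = np.arange(76)                 # donor catchments
ungauged = np.arange(76, 92)           # 16 treated as ungauged

pca = PCA(n_components=4).fit(signatures[gauged])   # 4 leading PCs
z_gauged = pca.transform(signatures[gauged])

# Random forest regression from attributes to signatures in PC space
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(attributes[gauged], z_gauged)

# Regionalized indices for the "ungauged" catchments, back in signature space
z_pred = rf.predict(attributes[ungauged])
sig_pred = pca.inverse_transform(z_pred)
print(sig_pred.shape)
```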

Across the real and synthetic experiments, routing is usually among the most identifiable processes, whereas the least identifiable processes are percolation and unsaturated zone processes. The precision, i.e. the probability of making an identification (whether correct or not), remains stable at around 25%. In the synthetic experiments we can calculate the (conditional) reliability of the identification method, i.e. the probability that, when the method makes an identification, the true mechanism is identified. The conditional reliability varies from 60% to 95%, depending on the magnitude of the combined regionalization and hydrological error. Our study contributes perspectives on hydrological mechanism identification under data-scarce conditions; we discuss limitations and opportunities for improvement.

 

Prieto, C., N. Le Vine, D. Kavetski, F. Fenicia, A. Scheidegger, and C. Vitolo (2022) An Exploration of Bayesian Identification of Dominant Hydrological Mechanisms in Ungauged Catchments, Water Resources Research, 58(3), e2021WR030705, doi: https://doi.org/10.1029/2021WR030705.

How to cite: Prieto, C., Nataliya, L. V., Dmitri, K., Fabrizio, F., Andreas, S., and Claudia, V.: Towards identification of dominant hydrological mechanisms in ungauged catchments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3933, https://doi.org/10.5194/egusphere-egu23-3933, 2023.

EGU23-5100 | ECS | Orals | HS3.5

Characterising errors using satellite metadata for eco-hydrological modelling 

Hui Zou, Lucy Marshall, and Ashish Sharma

Understanding the origin of errors in model predictions is a critical element in hydrologic model calibration and uncertainty estimation. While there exists a variety of plausible error sources, only one measure of the total residual error can be ascertained when the observed response is known. Here we show that collecting extra information a priori to characterise the data error before calibration can assist in improved model calibration and uncertainty estimation. A new model calibration strategy using satellite metadata is proposed as a means to inform the model prior, and subsequently to decompose the data error from the total residual error. This approach, referred to as the Bayesian ecohydrological error model (BEEM), is first examined in a synthetic setting to establish its validity, and then applied to three real catchments across Australia. Results show that 1) BEEM is valid in a synthetic setting, as it can perfectly ascertain the true underlying error; 2) in real catchments the model error is reduced when utilizing the observation error variance as an added contribution to the total error variance, while the magnitude of the total residual error is more robust when utilizing metadata about data quality proportionality as the basis for assigning the total error variance; 3) BEEM improves model calibration by estimating the model error appropriately and estimating the uncertainty interval more precisely. Overall, our work demonstrates a new approach to collecting prior error information from satellite metadata and reveals the potential for fully utilizing metadata about error sources in uncertainty estimation.
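
A heavily simplified reading of the error decomposition, assuming independent observation and model errors so that their variances add (the actual BEEM formulation is not given in the abstract), could look like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical residuals between simulated and observed fluxes
residuals = rng.normal(0.0, 0.9, size=3000)

# Observation-error variance assigned a priori from satellite metadata
# quality information (a hypothetical value, not BEEM's actual inputs)
var_obs = 0.4 ** 2

var_total = residuals.var(ddof=1)
var_model = max(var_total - var_obs, 0.0)   # assumes independent error sources
print(f"total {var_total:.3f} = obs {var_obs:.3f} + model {var_model:.3f}")
```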

How to cite: Zou, H., Marshall, L., and Sharma, A.: Characterising errors using satellite metadata for eco-hydrological modelling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5100, https://doi.org/10.5194/egusphere-egu23-5100, 2023.

EGU23-5635 | ECS | Orals | HS3.5

Spectral analysis of groundwater level time series reveals hydrogeological parameters 

Timo Houben, Mariaines Di-Dato, Christian Siebert, Thomas Kalbacher, Thomas Fischer, and Sabine Attinger

Groundwater resources are heavily exploited to supply domestic, industrial and agricultural water consumption. Climate and societal changes and the associated higher abstraction will alter subsurface storage in terms of quantity and quality in currently unpredictable ways. In order to ensure sustainable groundwater management, we must evaluate the intrinsic and spatially variable vulnerability of aquifers in terms of water quality issues and the resilience of groundwater volumes to external perturbations, such as severe droughts in connection with intensive irrigation. For this purpose, physically based numerical groundwater models are of great importance, especially on the regional scale. The equations applied in these models must be fed with the hydrogeological parameters: the transmissivity T and the storativity S.

Both parameters are typically obtained through time-consuming and cost-intensive hydrogeological in-situ tests or by laboratory analysis of core samples, yielding point information (drillings and wells) and thus parameters with limited transferability to regional settings. Instead, we propose to determine the parameters by spectral analysis of groundwater level fluctuations using (semi-)analytical solutions in the frequency domain. We developed a fully automated workflow that takes groundwater level and recharge time series, together with little information about the geometry of the aquifer, and derives T and S as well as tc (the characteristic response time). While the first two are used for hydrogeological modelling, the third can serve as an indicator of the resilience of the groundwater system directly, without additional modelling. The methodology was tested with great success in simplified numerical environments and was applied to real groundwater time series in southern Germany. The response times and storativities could be robustly estimated, while the transmissivities carry quantifiable uncertainties. Depending on the hydrogeological regime, the parameters represented effective, regional estimates.

How to cite: Houben, T., Di-Dato, M., Siebert, C., Kalbacher, T., Fischer, T., and Attinger, S.: Spectral analysis of groundwater level time series reveals hydrogeological parameters, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5635, https://doi.org/10.5194/egusphere-egu23-5635, 2023.

EGU23-6986 | Orals | HS3.5

On the elaboration of a robust calibration strategy for the large-scale GEM-Hydro model 

Etienne Gaborit, Daniel Princz, Juliane Mai, Hongren Shen, Bryan Tolson, and Vincent Fortin

As part of the Great Lakes Runoff Intercomparison Project (GRIP-GL; Mai et al., 2022), which compares the performance of different hydrologic models over the Great Lakes when calibrating them with the same meteorological inputs and geophysical databases, the GEM-Hydro hydrologic model used at Environment and Climate Change Canada (ECCC) for operational hydrologic forecasts was calibrated using different strategies. Following the calibration work of GRIP-GL, progress has been made in improving the calibration of the GEM-Hydro model.

The work presented here focuses on improvements in calibrating the GEM-Hydro model, compared to the default version of the model and to the performance obtained during the GRIP-GL project. For various reasons explained here, the GEM-Hydro calibration performed as part of GRIP-GL was suboptimal. The general calibration framework remains the same as in GRIP-GL, for example using the MESH-SVS-Raven model to speed up simulation times and transferring the calibrated parameters into GEM-Hydro afterwards, and relying on global calibrations for each of the six Great Lakes subdomains. However, several important changes have been made compared to the work performed in GRIP-GL, such as a new approach to represent the effect of tile drains, a revised set of flow stations used for calibration, and a revised objective function.

The proposed calibration methodology updates significantly improve GEM-Hydro streamflow performance across the Great Lakes domain, while also improving on, or maintaining, the performance of the default version of the model with respect to auxiliary variables and surface fluxes: snow, soil moisture, evapotranspiration, 2 m air temperature and dew point. Indeed, the model relies on 40 m atmospheric forcings for wind speed, temperature and humidity, and simulates its own 2 m atmospheric variables. To achieve this, it was necessary to constrain some parameter intervals during calibration, in order to prevent the calibration algorithm from choosing physically irrelevant parameter values that improve streamflow performance while degrading other hydrologic variables, due to equifinality.

Reference:

Mai, J., Shen, H., Tolson, B. A., Gaborit, E., Arsenault, R., Craig, J. R., Fortin, V., Fry, L. M., Gauch, M., Klotz, D., Kratzert, F., O'Brien, N., Princz, D. G., Rasiya Koya, S., Roy, T., Seglenieks, F., Shrestha, N. K., Temgoua, A. G. T., Vionnet, V., and Waddell, J. W. (2022). The Great Lakes Runoff Intercomparison Project Phase 4: The Great Lakes (GRIP-GL). Hydrol. Earth Syst. Sci., 26, 3537–3572. https://doi.org/10.5194/hess-26-3537-2022

How to cite: Gaborit, E., Princz, D., Mai, J., Shen, H., Tolson, B., and Fortin, V.: On the elaboration of a robust calibration strategy for the large-scale GEM-Hydro model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6986, https://doi.org/10.5194/egusphere-egu23-6986, 2023.

EGU23-8423 | Orals | HS3.5

Time-varying sensitivity analysis across different hydrological model structures, variables and time scales 

Björn Guse, Anna Herzog, Stephan Thober, Diana Spieler, Lieke Melsen, Jens Kiesel, Maria Staudinger, Paul Wagner, Ralf Loritz, Sebastian Müller, Michael Stölzle, Larissa Scholz, Justine Berg, Tobias Pilz, Uwe Ehret, Doris Düthmann, Tobias Houska, Sandra Pool, and Larisa Tarasova and the other members of the DFG Scientific network IMPRO

Temporal sensitivity analyses can be used to detect dominant model parameters at different time steps (e.g. daily or monthly), providing insights into their temporal patterns and reflecting the temporal variability of dominant hydrological processes. However, hydrological processes not only vary in time under different hydrometeorological conditions; the time scales of the implemented processes also differ. Here, the impact of different time scales (e.g. daily vs. monthly) on sensitivity patterns is investigated.

A temporal parameter sensitivity analysis is applied to three hydrological models (HBV, mHM and SWAT) for nine catchments in Germany. These catchments represent the variability of landscapes in Germany and are dominated by different runoff generation processes. In addition to discharge, further model fluxes and states such as evapotranspiration or soil moisture are used as target variables for the sensitivity analysis.

To analyse the impact of different time scales, two approaches are compared. In the first approach, daily simulated time series are used for the sensitivity analysis and the resulting sensitivities are then aggregated to monthly averages (Post-Agg). In the second approach, the simulated time series is first aggregated to a monthly time series and then used as input for the sensitivity analysis (Pre-Agg).
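
The difference between the two aggregation orders can be illustrated with a toy two-parameter model, shown below; the model, forcings, and the squared-correlation sensitivity proxy are all illustrative assumptions, not the study's actual models or sensitivity method.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_samples = 360, 2000
days = np.arange(n_days)

fast = rng.standard_normal(n_days)             # day-to-day forcing (shared)
slow = np.sin(2 * np.pi * days / 360.0)        # seasonal forcing

a = rng.uniform(0.5, 1.5, n_samples)           # fast-process parameter
b = rng.uniform(0.5, 1.5, n_samples)           # slow-process parameter
sims = a[:, None] * fast + b[:, None] * slow   # toy model output

def sensitivity(par, y):
    """First-order sensitivity proxy: squared correlation per time step."""
    return np.array([np.corrcoef(par, y[:, t])[0, 1] ** 2
                     for t in range(y.shape[1])])

# Post-Agg: daily sensitivities, averaged to months afterwards
post = sensitivity(a, sims).reshape(12, 30).mean(axis=1)

# Pre-Agg: aggregate the simulations to monthly means first
monthly = sims.reshape(n_samples, 12, 30).mean(axis=2)
pre = sensitivity(a, monthly)

# The fast-process parameter `a` loses sensitivity once the daily
# variability it controls is averaged out
print(post.mean(), pre.mean())
```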

Our analysis shows that the monthly averaged sensitivity patterns of different model outputs vary between the Post- and Pre-Aggregation approaches. Model parameters related to fast-reacting runoff processes, e.g. surface runoff or fast subsurface flow, are more sensitive when daily time series are used for the sensitivity analysis (Post-Agg). In contrast, model parameters related to processes with longer time scales, such as snowmelt or evapotranspiration, are more emphasized in monthly time series (Pre-Agg). These differences between the Post-Agg and Pre-Agg results are particularly pronounced when using the integrated value of discharge as the target variable; the differences are smaller when applying the sensitivity analysis directly to model fluxes.

Moreover, our analysis shows changes in dominant parameters along a north-south gradient which can be explained by the physiographic characteristics of the catchments. The differences in the sensitivity results between the models can be related to the different model structures.

Based on our analysis, we recommend using either model outputs of the major hydrological variables or different time scales for the sensitivity analysis, in order to derive the maximum information from the diagnostic model analysis and to understand how model parameters describe hydrological systems.

How to cite: Guse, B., Herzog, A., Thober, S., Spieler, D., Melsen, L., Kiesel, J., Staudinger, M., Wagner, P., Loritz, R., Müller, S., Stölzle, M., Scholz, L., Berg, J., Pilz, T., Ehret, U., Düthmann, D., Houska, T., Pool, S., and Tarasova, L. and the other members of the DFG Scientific network IMPRO: Time-varying sensitivity analysis across different hydrological model structures, variables and time scales, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8423, https://doi.org/10.5194/egusphere-egu23-8423, 2023.

EGU23-10001 | Posters on site | HS3.5

Investigating the spectral analysis of groundwater level fluctuations in a numerical model of the upper Danube catchment in Germany 

Rao Ali Javed, Timo Houben, Thomas Kalbacher, and Sabine Attinger

Common in-situ methods such as pumping tests, slug tests and laboratory analyses reveal aquifer parameters (that is, the transmissivity and storativity) that are localized and specific to the measurement location. A need for regionally valid aquifer parameters arises when setting up regional-scale, physically based groundwater models. Such models help water resource managers plan and predict the quality and quantity of groundwater resources, thus supporting decision making as well as a sustainable freshwater supply. A study by Houben et al. (2022) indicates that regional aquifer parameters can be obtained by analysing the frequency content of groundwater level time series. Their work builds upon a semi-analytical solution for the groundwater head spectrum, stochastically derived from the Boussinesq equation evoking the Dupuit assumptions. They found that the solution can be used to infer the transmissivity and storativity from groundwater level fluctuations, and validated their hypothesis in simplified numerical environments of different complexity.

In this work, we extended the numerical experiments and applied the semi-analytical solution in homogeneous and heterogeneous 2D (x-y plane) aquifers, as well as in a complex numerical 2D (x-y plane) model of the upper Danube catchment. We tested the hypothesis that certain locations can reveal regional aquifer parameters. In a homogeneous simulated model, the semi-analytical solution effectively recovers the model input parameters, which serves as a proof of concept. In a heterogeneous numerical model, the obtained parameters show the complex interplay between zones of different permeability: the effects of highly permeable zones can be observed in low-permeability zones further apart, and vice versa. The obtained parameters were in the range of the model input parameters and followed the trend of the input parameters along the direction of flow. In the model of the upper Danube, the obtained parameters were systematically larger than the input parameters. This shift was attributed to a violation of the assumptions of the semi-analytical solution; the complexity of the model leads to a breakdown of the semi-analytical solution in some areas. Analyses on a sub-catchment scale revealed that when the assumptions of the analytical solution are met, the obtained parameters reflect the effective parameters.

How to cite: Javed, R. A., Houben, T., Kalbacher, T., and Attinger, S.: Investigating the spectral analysis of groundwater level fluctuations in a numerical model of the upper Danube catchment in Germany, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10001, https://doi.org/10.5194/egusphere-egu23-10001, 2023.

 

Hydrologic models are often used to estimate streamflows at ungauged locations for infrastructure planning. These models can contain a multitude of parameters that themselves need to be estimated through calibration. Yet multiple sets of parameter values may perform nearly equally well in simulating flows at gauged sites, making these parameters highly uncertain. Markov Chain Monte Carlo (MCMC) algorithms can quantify parameter uncertainties; however, this can be computationally expensive for hydrological models. Thus, it is important to select an MCMC algorithm that is effective (converges to the true posterior parameter distribution), efficient (fast), reliable (consistent across random seeds) and controllable (insensitive to the algorithm's hyperparameters). These characteristics can be assessed through algorithm diagnostics, but current MCMC diagnostics mostly focus on evaluating the convergence of an individual search process, not on diagnosing general problems of the algorithms. Therefore, additional diagnostics are required to capture an algorithm's sensitivity to its hyperparameters and to compare performance across problems.

Here, we propose new diagnostics to assess the effectiveness, efficiency, reliability and controllability of four MCMC algorithms: Adaptive Metropolis, Sequential Monte Carlo, Hamiltonian Monte Carlo, and DREAM(ZS). The diagnostic method builds on diagnostics used to assess the performance of Multi-Objective Evolutionary Algorithms (MOEAs), and allows us to evaluate the sensitivity of the algorithms to their hyperparameterization and to compare their performance on multiple metrics, such as the Gelman-Rubin diagnostic and the Wasserstein distance from the true posterior. We illustrate our diagnostics using the simple hydrological model HYMOD and several analytical test problems. This allows us to see which algorithms perform well on problems with different characteristics (e.g. known vs. unknown posterior shapes, uni- vs. multi-modality, low- vs. high-dimensionality). Since posterior shapes and modality are often unknown for hydrological problems, it is important to calibrate such models with an MCMC algorithm that is robust across a wide variety of posterior shapes, and our new diagnostics allow for this identification.
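
Two of the metrics mentioned, the Gelman-Rubin diagnostic and the 1-D Wasserstein distance, can be computed compactly; the sketch below uses synthetic well-mixed chains and is not the authors' diagnostic suite.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for chains shaped (m, n)."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled posterior variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
chains = rng.normal(0.0, 1.0, size=(4, 2000))    # 4 well-mixed chains
print("R-hat:", gelman_rubin(chains))            # ~1.0 indicates convergence

# Distance between the sampled posterior and a known truth (1-D case)
truth = rng.normal(0.0, 1.0, size=8000)
print("Wasserstein:", wasserstein_distance(chains.ravel(), truth))
```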

How to cite: Kavianihamedani, H., Quinn, J., and Smith, J.: New Diagnostic Assessment of MCMC Algorithms Effectiveness, Efficiency, Reliability, and Controllability in Calibrating Hydrological Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10326, https://doi.org/10.5194/egusphere-egu23-10326, 2023.

EGU23-10510 | ECS | Posters on site | HS3.5

Uncertainty Quantification in Hydrological and Environmental Modeling based on Polynomial Chaos Expansion 

Zoe Li, Pengxiao Zhou, and Maysara Ghaith

There are significant uncertainties associated with the estimates of model parameters in hydrological and environmental modeling. Such uncertainties can propagate within a modeling framework, leading to considerable deviation of predicted values from their real values. Quantifying the uncertainties associated with model parameters can be computationally expensive and is still a daunting challenge for hydrological and environmental engineers. In this study, a series of Polynomial Chaos Expansion (PCE) methods, which have a significant advantage in computational efficiency, is developed to assess the propagation of parameter uncertainty. The proposed approaches were applied to two hydrological/environmental modeling case studies. The uncertainty quantification results will be compared with those from the traditional Monte Carlo simulation technique to demonstrate the effectiveness and efficiency of the proposed approaches. This work provides an efficient and reliable alternative for assessing the impacts of parameter uncertainties in hydrological and environmental modeling.
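
A minimal PCE example for a single standard-normal input, using a probabilists' Hermite basis fitted by least squares (the specific PCE variants of the study are not detailed in the abstract), illustrates why the approach is cheap: the surrogate's mean and variance follow directly from the coefficients.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander
from math import factorial

rng = np.random.default_rng(0)

def model(xi):
    # Hypothetical scalar model with a standard-normal input (a stand-in
    # for a hydrological simulator with one uncertain parameter)
    return np.exp(0.3 * xi) + 0.1 * xi ** 2

deg = 6
xi = rng.standard_normal(500)            # training samples of the input
A = hermevander(xi, deg)                 # probabilists' Hermite basis He_k(xi)
coef, *_ = np.linalg.lstsq(A, model(xi), rcond=None)

# Moments follow from orthogonality: E[He_j(X) He_k(X)] = k! * delta_jk
mean = coef[0]
var = sum(coef[k] ** 2 * factorial(k) for k in range(1, deg + 1))

mc = model(rng.standard_normal(200_000))   # Monte Carlo reference
print(mean, mc.mean(), var, mc.var())
```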

How to cite: Li, Z., Zhou, P., and Ghaith, M.: Uncertainty Quantification in Hydrological and Environmental Modeling based on Polynomial Chaos Expansion, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10510, https://doi.org/10.5194/egusphere-egu23-10510, 2023.

EGU23-10644 | Orals | HS3.5

Impact of Model Parameters on Runoff Sensitivities in the Community Land Model: A Study on the Upper Colorado River Basin 

Yadu Pokhrel, Ahmed Elkouk, Lifeng Luo, Liz Payton, Ben Livneh, and Yifan Cheng

Understanding how land surface models (LSMs) partition precipitation into evapotranspiration and runoff under a changing climate is key to improved future hydrologic predictions. This sensitivity is rarely tuned in land models, as evidenced by prevalent biases in the sensitivity of simulated runoff to precipitation and temperature changes compared to observational estimates. Here, using the Community Land Model (CLM5) over the Colorado River basin (CRB), we investigate which model parameters are informative for runoff sensitivities and how their choice affects the sensitivities under changing temperature and precipitation. We focus on the headwater region of the CRB, motivated by inconsistent model estimates of runoff sensitivities in the region and the critical need to better understand runoff changes to address the ongoing water crises in the CRB. In each headwater basin, a set of informative parameters was identified through parameter perturbations using a “one-at-a-time” method within an adaptive surrogate-based model optimization scheme (ASMO). The perturbation results highlight that different parameter sets with similar performance (with respect to water-year discharge) provide very different runoff sensitivities to temperature and precipitation during the 1951-2010 period. Additionally, both the precipitation and temperature sensitivities of runoff respond to similar parameters across the region. The most sensitive parameters control the conductance-photosynthesis relationship, the soil surface resistance for direct evaporation, the partitioning of runoff into surface and subsurface components, and the soil hydraulic properties. We show how the importance of each parameter varies through the parameter space and derive parameter estimates by maximizing the “fit to observed sensitivities” within the ASMO scheme. Our results provide key insights into parameter optimization to improve long-term hydrologic sensitivities in LSMs.
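
A one-at-a-time perturbation is simple to sketch; the toy runoff function and parameter names below are hypothetical stand-ins for CLM5 parameters, not the actual model.

```python
import numpy as np

def runoff_model(p):
    # Hypothetical stand-in for a CLM5 water-year runoff simulation
    return p["b_runoff"] * 1.4 + p["soil_k"] * 0.6 - p["stomatal_g"] * 0.9

defaults = {"b_runoff": 1.0, "soil_k": 1.0, "stomatal_g": 1.0}

# One-at-a-time perturbation: vary each parameter by +/-10 % around defaults,
# holding all other parameters fixed
for name in defaults:
    lo, hi = dict(defaults), dict(defaults)
    lo[name] *= 0.9
    hi[name] *= 1.1
    delta = runoff_model(hi) - runoff_model(lo)
    print(f"{name}: d(runoff) = {delta:+.3f}")
```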

How to cite: Pokhrel, Y., Elkouk, A., Luo, L., Payton, L., Livneh, B., and Cheng, Y.: Impact of Model Parameters on Runoff Sensitivities in the Community Land Model: A Study on the Upper Colorado River Basin, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10644, https://doi.org/10.5194/egusphere-egu23-10644, 2023.

EGU23-11129 | ECS | Orals | HS3.5

Pitfalls and Opportunities in the Use of Markov-Chain Monte Carlo Ensemble Samplers for Vadose Zone Model Calibration 

Giuseppe Brunetti, Jiri Simunek, Thomas Wöhling, and Christine Stumpp

Bayesian inference has become the most popular approach to uncertainty assessment in vadose zone hydrological modeling. By combining prior information with observations and model predictions, it enables hydrologists to infer parameter posterior distributions, verify model adequacy, and assess a model's predictive uncertainty. In particular, the posterior distribution is frequently the variable of interest for modelers, as it describes the epistemic uncertainty of model parameters conditioned on measurements. Gradient-free Markov-Chain Monte Carlo (MCMC) ensemble samplers based on Differential Evolution (DE) or Affine Invariant (AI) strategies have been used to approximate the posterior distribution, which is frequently anisotropic and correlated in vadose zone-related problems. However, a rigorous benchmark of different MCMC algorithms providing guidelines for their application in vadose zone hydrological model calibration is still missing. In this study, we elucidate the behavior of MCMC ensemble samplers by performing an in-depth comparison of four samplers that use AI moves or DE-based strategies to approximate the target density. Two Rosenbrock distributions, and one synthetic and one actual case study focusing on the inverse estimation of soil hydraulic parameters using HYDRUS-1D, are used to compare the algorithms in different dimensions. The analysis reveals that AI-based samplers are immune to affine transformations of the target density, which instead double the autocorrelation time for DE-based samplers. This behavior is reiterated in the synthetic scenario, for which AI-based algorithms outperform DE-based strategies. However, this performance gain disappears when the number of soil parameters increases from 7 to 16, with both samplers exhibiting poor acceptance rates, which are not improved by increasing the number of chains from 50 to 200 or by mixing different strategies.
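
The AI-vs-DE comparison on a Rosenbrock-type target can be reproduced in miniature with the emcee package, which provides both affine-invariant stretch moves and differential-evolution moves; the tempering constant and run lengths below are arbitrary choices, not the study's setup.

```python
import numpy as np
import emcee

def log_rosenbrock(theta):
    """Tempered 2-D Rosenbrock log-density, a standard hard MCMC target."""
    x, y = theta
    return -(100.0 * (y - x * x) ** 2 + (1.0 - x) ** 2) / 20.0

ndim, nwalkers, nsteps = 2, 50, 10_000
rng = np.random.default_rng(0)
p0 = np.array([1.0, 1.0]) + 1e-2 * rng.standard_normal((nwalkers, ndim))

for name, move in [("affine-invariant stretch", emcee.moves.StretchMove()),
                   ("differential evolution", emcee.moves.DEMove())]:
    sampler = emcee.EnsembleSampler(nwalkers, ndim, log_rosenbrock, moves=move)
    sampler.run_mcmc(p0, nsteps, progress=False)
    # Integrated autocorrelation time: the efficiency measure compared here
    tau = sampler.get_autocorr_time(quiet=True)
    print(f"{name}: acceptance {sampler.acceptance_fraction.mean():.2f}, tau {tau}")
```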

How to cite: Brunetti, G., Simunek, J., Wöhling, T., and Stumpp, C.: Pitfalls and Opportunities in the Use of Markov-Chain Monte Carlo Ensemble Samplers for Vadose Zone Model Calibration, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11129, https://doi.org/10.5194/egusphere-egu23-11129, 2023.

EGU23-13104 | Posters on site | HS3.5

Combining water and pesticide data with coupled surface/subsurface hydrological modeling to reduce its uncertainty. 

Claire Lauvernet, Claudio Paniconi, Emilie Rouzies, Laura Gatel, and Antoine Caisson

In small agricultural catchments across Europe, the intensive use of pesticides leads to widespread contamination of rivers and groundwater, largely due to hydraulic transfers of these reactive solutes from plots to rivers. These transfers must be better understood and described at the watershed scale in order to propose best management practices adapted to the catchment and to reduce its contamination. The physically based model CATHY simulates interactions between surface and subsurface hydrology and reactive solute transport. However, the high sensitivity of pesticide transfers to spatially heterogeneous soil properties induces uncertainty that should be quantified and reduced. In situ data on pesticides in a catchment are usually rare and not continuous in time and space. Likewise, satellite imagery can provide spatial observations of hydrologic variables, but generally not of pesticide fluxes and concentrations, and only at limited scales and time frequencies. The objective of this work is to combine these three types of information (model, in situ data, images) and their associated errors with data assimilation methods, in order to reduce the uncertainties of pesticide and hydrological variables. The sensitivity to the spatial density and temporal frequency of the data will be evaluated, as well as the efficiency of coupled data assimilation, i.e., the effect of assimilating hydrological data on pesticide-related variables. The methods will be developed using a Python package and compared/evaluated in twin experiments using virtual data generated over a real vineyard catchment in Beaujolais, France, in order to ensure the realism of the experiments, data, and associated errors.

How to cite: Lauvernet, C., Paniconi, C., Rouzies, E., Gatel, L., and Caisson, A.: Combining water and pesticide data with coupled surface/subsurface hydrological modeling to reduce its uncertainty., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13104, https://doi.org/10.5194/egusphere-egu23-13104, 2023.

EGU23-13589 | Orals | HS3.5 | Highlight

A comparison of sensitivity analysis methods and their value for comparing denitrification models 

Jesús Carrera and Jordi Petchamé

Numerous methods exist to gain insight into a model's performance. Sensitivity analysis (SA) tools provide information on how a model output depends on model parameters, and it is widely argued that SA is an essential tool for assessing model uncertainty. Here, we review global SA using Variogram Analysis of Response Surfaces (VARS), variance-based methods (Sobol' indices) and polynomial chaos expansion. For the comparison, we use a set of denitrification models, which are needed to assess the fate of nitrate, a global challenge. For each of the models, we assess the uncertainty and reliability of predictions, and the use of SA tools in designing experiments to reduce model uncertainty.
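
As an example of the variance-based branch of this comparison, Sobol' indices for a toy model can be computed with the open-source SALib package (one common implementation; the authors' tooling is not specified). The Michaelis-Menten-style rate function and its parameter bounds are hypothetical.

```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

# Hypothetical 3-parameter denitrification-rate toy model
problem = {
    "num_vars": 3,
    "names": ["vmax", "km", "theta"],
    "bounds": [[0.1, 2.0], [0.5, 5.0], [1.0, 1.2]],
}

def rate(X, no3=2.0, temp=15.0):
    vmax, km, theta = X.T
    # Michaelis-Menten kinetics with a temperature correction factor
    return vmax * no3 / (km + no3) * theta ** (temp - 20.0)

X = saltelli.sample(problem, 1024)   # N * (2D + 2) parameter sets
Y = rate(X)
Si = sobol.analyze(problem, Y)
print("first-order:", Si["S1"])      # main effects
print("total-order:", Si["ST"])      # main effects plus interactions
```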

How to cite: Carrera, J. and Petchamé, J.: A comparison of sensitivity analysis methods and their value for comparing denitrification models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13589, https://doi.org/10.5194/egusphere-egu23-13589, 2023.

EGU23-13981 | Posters on site | HS3.5

Sensitivity analysis of water balance components under climate change in Saxony 

Niels Schuetze, Corina Hauffe, Sofie Pahner, Clara Brandes, Kan Lei, and Mellentin Udo

Catchments in Saxony differ regarding their physiographic characteristics (topography, geomorphology, geology, land use, soils, etc.) and their climatic boundary conditions. Both factors influence the flow behavior and the water balance components of catchments. How sensitively the water balance of a catchment responds to current and future changes in climatic boundary conditions is difficult to predict and is associated with significant uncertainties. In Saxony, the pronounced drought in groundwater and surface water from 2018 to 2020 led to considerable regional problems in water supply and quality.

Schwarze et al. (2017) investigated trends in the observed discharge and in variables derived by hydrograph separation (e.g. baseflow) in an earlier sensitivity study. In this presentation, we show the results of an extension of this analysis with current observational data until 2020. The following research questions are investigated: (i) Are catchments in Saxony already responding to changing climatic conditions? (ii) Which regions show the most significant changes in discharge behavior relative to other water balance components? (iii) What are the factors and drivers of changes in the water balance of Saxon catchments?

The study is based solely on observational data for precipitation, temperature, and discharge in the period 1961 to 2020 in Saxony. Breakpoint analysis, hydrograph separation, and sensitivity analysis of hydrological signatures are performed for different sets of climate periods to quantify changes in, and the elasticity of, the water balance components. As a result, a decreasing trend in mean flow can be seen for almost all 88 investigated, undisturbed catchments in Saxony. This trend is more pronounced in the mountainous regions than in the Saxon lowlands. Despite a slight increase in mean annual precipitation, the temperature rise of about 1 °C in 1991-2020 compared to 1961-1990 leads in all catchments to increasing evapotranspiration, reduced discharge, and reduced groundwater recharge.
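
As an illustration of the breakpoint-analysis step (the abstract does not specify which test is used), a generic non-parametric Pettitt change-point test can be sketched as follows:

```python
import numpy as np

def pettitt_change_point(x):
    """Pettitt (1979) non-parametric change-point test.

    Returns the most probable break index (1-based) and an approximate
    p-value. A generic sketch of 'breakpoint analysis', not necessarily
    the authors' exact method.
    """
    n = x.size
    r = np.argsort(np.argsort(x)) + 1.0                  # ranks of the series
    # U_t = 2 * sum(ranks up to t) - t * (n + 1), for t = 1..n-1
    U = 2.0 * np.cumsum(r)[:-1] - np.arange(1, n) * (n + 1)
    t_break = int(np.argmax(np.abs(U))) + 1
    K = np.abs(U).max()
    p = min(2.0 * np.exp(-6.0 * K ** 2 / (n ** 3 + n ** 2)), 1.0)
    return t_break, p

rng = np.random.default_rng(0)
# Hypothetical annual mean flow with a downward shift after year 30
flow = np.concatenate([rng.normal(10, 2, 30), rng.normal(8, 2, 30)])
print(pettitt_change_point(flow))
```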

 

References:

Schwarze, R., Wagner, M. and Röhm, P. (2017). Adaptation strategies to climate change - Analysis of the sensitivity of water balance variables of Saxon gauge catchments with respect to the increased temperature level from 1988 onwards compared to the reference state of 1961-1987. Ed.: Saxon State Office for Environment, Agriculture and Geology (LfULG), 2017.

How to cite: Schuetze, N., Hauffe, C., Pahner, S., Brandes, C., Lei, K., and Udo, M.: Sensitivity analysis of water balance components under climate change in Saxony, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13981, https://doi.org/10.5194/egusphere-egu23-13981, 2023.

EGU23-15689 | ECS | Posters on site | HS3.5

Adaptive Surrogate Likelihood Function for Blended Hydrologic Models 

Rezgar Arabzadeh, Jonathan Romero-Cuellar, Robert Chlumsky, James Craig, and Bryan Tolson

This abstract introduces a recipe for an adaptive general likelihood function and its application to the Bayesian inference of model parameter and structure uncertainty. The proposed methodology focuses on a special class of likelihood function, hereinafter referred to as the adaptive general likelihood function (AGL), which requires minimal a priori assumptions and knowledge about the model residuals. The goal of the AGL is to characterize the model residuals independently from the inference framework, in order to avoid incorrect posterior estimation resulting from the joint inference of model and error-model parameters. Mathematically, the AGL is structured as a mixture of Gaussian distributions joined with a first-order autoregressive model, accounting for the shape and the autocorrelation of the error model, respectively. To assess the AGL, it is benchmarked against a formal likelihood function formulated by Schoups and Vrugt (2010) and evaluated for 24 CAMELS basins where the blended model has been applied deterministically with success (Chlumsky et al., 2022). Both approaches are compared with the residuals' empirical distributions using various statistical tests. The model used here is a blended hydrologic model introduced by Mai et al. (2021), a class of hydrologic models constructed by averaging (blending) various process options at the process flux level. This blending means that model calibration identifies traditionally calibrated process parameters as well as the weights used to average the multiple process options. The model is deployed in the Raven hydrologic framework (Craig et al., 2020), and both process weights and parameters were calibrated deterministically for both high flows and low flows using the PADDS algorithm (Asadzadeh and Tolson, 2013). This multi-objective calibration yields a suite of calibrated blended models, which is then utilized for error-model development and testing. The test results indicated statistically comparable performance of both methods for t-distributed, highly skewed, and long-tailed residual errors, which are apparent in many hydrologic model residuals. Finally, to decouple the Bayesian inference framework from the error-model parameters, an epsilon-support vector regression (eps-SVR) is trained deterministically as a surrogate model to map the structural/parametric variability to the residual error-model parameters. The eps-SVR calibration performance metrics on the training set indicate a high-quality surrogate with promising performance.
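
A compact sketch of the AGL's mathematical core, an AR(1) error model with Gaussian-mixture innovations, is given below; the mixture settings and synthetic residuals are illustrative, and this is a reading of the abstract rather than the authors' exact formulation.

```python
import numpy as np
from scipy.stats import norm

def agl_loglik(residuals, phi, w, mu, sigma):
    """Log-likelihood of residuals under an AR(1) model whose innovations
    follow a Gaussian mixture (weights w, means mu, std devs sigma).
    A sketch of the AGL idea, not the authors' exact formulation."""
    eta = residuals[1:] - phi * residuals[:-1]       # AR(1) innovations
    comp = np.stack([wk * norm.pdf(eta, mk, sk)
                     for wk, mk, sk in zip(w, mu, sigma)])
    return np.log(comp.sum(axis=0)).sum()

# Synthetic autocorrelated, skewed residuals to evaluate the function on
rng = np.random.default_rng(0)
r = np.zeros(2000)
for t in range(1, r.size):
    heavy = rng.random() < 0.2                       # 20 % heavy-tailed component
    eps = rng.normal(0.5, 1.0) if heavy else rng.normal(-0.2, 0.3)
    r[t] = 0.7 * r[t - 1] + eps

print(agl_loglik(r, phi=0.7, w=[0.8, 0.2], mu=[-0.2, 0.5], sigma=[0.3, 1.0]))
```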

How to cite: Arabzadeh, R., Romero-Cuellar, J., Chlumsky, R., Craig, J., and Tolson, B.: Adaptive Surrogate Likelihood Function for Blended Hydrologic Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15689, https://doi.org/10.5194/egusphere-egu23-15689, 2023.

Applications of statistical learning theory (SLT) in hydrology have taken two forms: Support Vector Machines and other complexity-regularized machine learning algorithms that learn and predict input-output patterns such as rainfall-runoff time series, and the identification of the optimal complexity of low-order models, such as k-nearest-neighbour models, to predict hydrological time series such as streamflow. The regularization of model complexity offers a way to identify the minimal complexity a model needs to accurately predict a time series of interest. However, such applications often assume that the modelled residuals are independent of each other. This limits their application to conceptual hydrological models, where residuals are often autocorrelated. This paper applies recent results on risk bounds for time series forecasting, and SLT approaches to dynamical system identification, to conceptual hydrological models, offering a means to identify the optimal complexity of conceptual models and complexity-regularized streamflow predictions based on it.

Basins from the CAMELS data set are used to demonstrate the effect of regularizing the problem of hydrological model calibration on streamflow prediction over unseen data. SAC-SMA and SIXPAR (a lower-order version of SAC-SMA) are used as model examples. Preliminary results show that prediction uncertainty bounds are narrower when regularization does not improve the performance of a calibrated model over unseen data. This effect is stronger in drier basins than in humid ones. Also, as expected, the effect is stronger when the training data size is small, and it holds for both SAC-SMA and SIXPAR.

How to cite: Pande, S. and Moayeri, M.: Complexity-based robust hydrologic prediction: extension of statistical learning theory to conceptual hydrological models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16039, https://doi.org/10.5194/egusphere-egu23-16039, 2023.

Due to the lack of accurate representation of hydrological processes and of parameter measurements, physically based hydrological models contain many parameters requiring calibration to historical observations so that reliable hydrological inferences can be obtained. With increasing data availability from various sources (e.g., satellite remote sensing, climate model reanalyses), additional information on different water balance components (e.g., soil moisture, groundwater storage) is used to constrain and validate hydrological models, resulting in better model performance and parameter identifiability. However, given the emergence of multiple datasets for the various water budget components, and their differences in temporal and spatial resolution, the uncertainties in these datasets, when used together to drive and evaluate hydrological models, can introduce inconsistencies in water balance estimation and lead to a non-closure problem, which in turn can produce biased estimates of parameters and water balance components.

This study addresses this issue by examining the impact of inconsistent water balance component data on model performance and by exploring the importance of hydrologically consistent data for robust hydrological inference. The assessment is done using a Canadian hydrologic land surface model, MESH, in the Saskatchewan River basin, Canada, over the period 2002 to 2016. Seven precipitation datasets, seven evapotranspiration products, one source of water storage data (GRACE, from three different centers using spherical harmonic and mass concentration approaches) and observed discharge data from hydrometric stations are selected as the input and evaluation data. A reference water balance dataset is developed to optimally combine all available data sources for each water balance component and to obtain water balance closure through a constrained Kalman filter data assimilation technique. The MESH model is rerun with this reference dataset, and the results are assessed and compared to different combinations of input and evaluation data. Preliminary results reveal large variations in model performance for the water balance components when using different combinations of input and evaluation data, and results using the reference dataset are expected to have less biased water balance component estimates. This study highlights the necessity of using hydrologically consistent data before any model runs and model evaluation.
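
The closure step can be sketched per pixel or per basin as a constrained Kalman-filter-type update that redistributes the budget residual according to each component's error variance; the numbers below are hypothetical, and independent errors are assumed.

```python
import numpy as np

# Hypothetical monthly water-balance estimates (mm): P, ET, Q, dS
x = np.array([85.0, 42.0, 30.0, 5.0])
sd = np.array([8.0, 10.0, 2.0, 6.0])        # assumed independent errors
R = np.diag(sd ** 2)

H = np.array([[1.0, -1.0, -1.0, -1.0]])     # closure: P - ET - Q - dS = 0

# Constrained Kalman update: project the estimate onto the closure
# constraint, weighting each component by its error variance
K = R @ H.T @ np.linalg.inv(H @ R @ H.T)
x_closed = x - K @ (H @ x)

print("residual before:", (H @ x).item(), "after:", (H @ x_closed).item())
```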

How to cite: Wong, J. S., Yassin, F., and Famiglietti, J. S.: Does hydrologically consistent data improve model performance? The importance of closing the water balance of input and evaluation data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16924, https://doi.org/10.5194/egusphere-egu23-16924, 2023.

EGU23-1668 | Posters virtual | ESSI1.7

Assessing the retrieval accuracy of SMOS soil moisture product in a Greek agricultural setting 

Triantafyllia Petsini and George P. Petropoulos

Soil moisture is an important parameter of the Earth system and plays a key role in understanding soil-atmosphere interactions through the energy balance and the hydrological cycle. Information on its spatiotemporal variability is of crucial importance in several research topics and applications. Remote sensing today provides a very promising avenue towards obtaining information on the variability of soil moisture at varying spatial and temporal resolutions, and a number of relevant operational products are currently available from different satellite sensors.

The objective of the present study has been to evaluate one such product, specifically that from the SMOS satellite, in a typical Mediterranean setting located in Greece. In particular, this study examines the agreement of the SMOS soil moisture product with collocated field measurements from the Prefecture of Larisa for the calendar year 2020, acquired from Neuropublic S.A. The agreement between the two datasets was evaluated on the basis of several statistical measures. The effects of topographical and geomorphological features, land use/cover, the relative satellite orbit type, and Radio Frequency Interference (RFI) were also examined as part of our analysis.

To our knowledge, this study is one of the few providing insight into the accuracy of the SMOS soil moisture product in a Greek setting. Our findings can provide important insights towards understanding the practical value of such products in agricultural and arid/semi-arid Mediterranean environments such as that of Greece, and can also help efforts directed towards improving their retrieval accuracy.

Keywords: soil moisture; operational product; remote sensing; SMOS; validation; agriculture; Mediterranean setting

How to cite: Petsini, T. and Petropoulos, G. P.: Assessing the retrieval accuracy of SMOS soil moisture product in a Greek agricultural setting, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1668, https://doi.org/10.5194/egusphere-egu23-1668, 2023.

EGU23-1761 | ECS | Posters virtual | ESSI1.7

Exploring the synergy of EnMAP hyperspectral imagery with Machine Learning for land use- land cover mapping in a Mediterranean setting 

Christina Lekka, Spyridon E. Detsikas, and George P. Petropoulos

The Environmental Mapping and Analysis Program (EnMAP) is a new spaceborne German hyperspectral satellite mission for monitoring and characterizing the Earth’s environment on a global scale. The EnMAP mission provides high-quality, spectrally detailed information in the VNIR and SWIR ranges over large areas, with wide temporal coverage and high spatial resolution. This high-quality data, freely available to the scientific community, reveals great potential for a wide range of ecological and environmental applications, such as accurate and up-to-date LULC thematic maps.

The objective of the present study is to explore the accuracy of EnMAP for land cover mapping over a heterogeneous landscape. A typical Mediterranean setting located in Greece is used as a case study. The methodology is based on the synergistic use of machine learning techniques and EnMAP imagery coupled with other ancillary data, and was carried out in EnMAP-Box 3, an open-source toolbox designed for a GIS environment. Validation of the derived LULC maps has been carried out using the standard error matrix approach and also via comparisons against existing operational LULC products.
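
The error-matrix validation mentioned above is standard and easy to sketch; the class labels and simulated agreement below are hypothetical, not the study's validation data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)

# Hypothetical reference vs. classified labels for 500 validation pixels
classes = ["urban", "forest", "cropland", "water"]
ref = rng.integers(0, 4, 500)
pred = np.where(rng.random(500) < 0.85, ref, rng.integers(0, 4, 500))

cm = confusion_matrix(ref, pred)              # rows: reference, cols: classified
producers = np.diag(cm) / cm.sum(axis=1)      # producer's accuracy (omission)
users = np.diag(cm) / cm.sum(axis=0)          # user's accuracy (commission)

print(cm)
print("overall accuracy:", accuracy_score(ref, pred))
print("kappa:", cohen_kappa_score(ref, pred))
```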

To our knowledge, this research is among the first to explore the advantages of the hyperspectral EnMAP satellite mission in the context of LULC mapping. The results of the present study are expected to provide valuable input for LULC mapping applications and to demonstrate the potential of hyperspectral EnMAP data for improved performance and high accuracy in LULC mapping.

 

KEYWORDS: EnMAP, Land cover, Land use, Hyperspectral remote sensing, Machine Learning

How to cite: Lekka, C., Detsikas, S. E., and Petropoulos, G. P.: Exploring the synergy of EnMAP hyperspectral imagery with Machine Learning for land use- land cover mapping in a Mediterranean setting, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1761, https://doi.org/10.5194/egusphere-egu23-1761, 2023.

Information on Impervious Surface Areas (ISAs) is required in various studies related to the urban environment. The continuous expansion of these surfaces is noticeable in large urban centers as a result of urbanization. The development of automated methodologies for mapping ISAs using remote sensing data has experienced considerable growth in recent years.

The aim of the present study is the long-term mapping of ISA changes in Athens, Greece, from 1984 to 2022, exploiting the Landsat archive and contemporary geospatial data processing methods, such as machine learning. The study is implemented in the Google Earth Engine cloud platform, and the final results are presented in a WebGIS environment.
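
A minimal sketch of such a workflow in the GEE Python API is shown below; the geometry, date range, band selection, and the training-sample asset are hypothetical placeholders (and authentication/project setup is assumed), so this is an illustration of the approach rather than the study's implementation.

```python
import ee
ee.Initialize()  # assumes prior authentication and project configuration

# Area of interest around Athens (hypothetical extent)
athens = ee.Geometry.Point([23.73, 37.98]).buffer(20000)

# Cloud-reduced summer composite from the Landsat archive
image = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
         .filterBounds(athens)
         .filterDate("2022-06-01", "2022-09-01")
         .median()
         .select(["SR_B2", "SR_B3", "SR_B4", "SR_B5", "SR_B6", "SR_B7"]))

# Labelled samples (property "isa": 0/1); the asset name is hypothetical
training = ee.FeatureCollection("users/example/athens_isa_samples")
samples = image.sampleRegions(collection=training, properties=["isa"], scale=30)

# Random forest classifier mapping reflectance to impervious / pervious
clf = ee.Classifier.smileRandomForest(100).train(
    features=samples, classProperty="isa",
    inputProperties=image.bandNames())

isa_map = image.classify(clf)
```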

The results of the present study can contribute to a better understanding of urban expansion dynamics and the key drivers of urban sprawl that affect cities such as Athens. Furthermore, they can serve as a reference for the further development of applications related to urban environments using machine learning techniques combined with remote sensing data.

 

KEYWORDS: ISA, urban sprawl, Landsat, GEE, WebGIS, Greece

How to cite: Dermosinoglou, K. and Petropoulos, G. P.: Long term monitoring of the changes in Impervious Surface Areas in a Greek setting using Machine Learning and Remote Sensing data: the case of Athens Greece, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1785, https://doi.org/10.5194/egusphere-egu23-1785, 2023.

EGU23-3434 | ECS | Orals | ESSI1.7

The benefit of textural features for SAR-based tropical forest disturbance mapping 

Johannes Balling, Martin Herold, and Johannes Reiche

Cloud-penetrating Synthetic Aperture Radar (SAR) imagery has proven effective for tropical forest monitoring at national and pan-tropical scales. Current SAR-based disturbance detection methods rely on identifying decreased post-disturbance backscatter values as an indicator of forest disturbance. However, these methods suffer from a major shortcoming: they show omission errors and delayed detections for some disturbance types (e.g., logging or fires) where post-disturbance debris or tree remnants result in stable SAR backscatter values similar to those of stable forest. Despite fairly stable backscatter values, we hypothesize that the different orientations and arrangements of tree remnants lead to an increased heterogeneity of adjacent disturbed pixels. Increased heterogeneity can be quantified by textural features. We assessed six Gray-Level Co-occurrence Matrix (GLCM) textural features utilizing Sentinel-1 C-band SAR time series. We used a pixel-based probabilistic change detection algorithm to detect forest disturbances based on each GLCM feature and compared them against forest disturbances detected using only backscatter data. We further developed a method to combine both backscatter and GLCM features to detect forest disturbances. GLCM Sum Average (SAVG) performed best out of the tested GLCM features. Omission errors were reduced by up to 36% and the timeliness of detections improved by up to 30 days when applying the combination of backscatter and GLCM SAVG. Test sites characterized by large unfragmented disturbance patches (e.g., large-scale clearings, fires and mining) showed the greatest spatial and temporal improvement. A GLCM kernel size of 5 led to the best trade-off, improving the timeliness of detections and reducing omission errors while not introducing commission errors. The robustness of the developed method was verified for a variety of natural and human-induced forest disturbance types in the Amazon Biome. Our results show that combined SAR-based textural features and backscatter can overcome omission errors caused by post-disturbance tree remnants, and can support law enforcement activities by improving the spatial and temporal accuracy of operational SAR-based disturbance monitoring and alerting systems.
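
The GLCM Sum Average statistic itself is straightforward to compute; the sketch below quantizes a backscatter patch and evaluates SAVG with scikit-image, using synthetic homogeneous and heterogeneous patches (illustrating the computation only, not the detection skill reported above).

```python
import numpy as np
from skimage.feature import graycomatrix

def glcm_sum_average(patch_db, lo=-15.0, hi=0.0, levels=32):
    """GLCM Sum Average (Haralick SAVG) of a SAR backscatter patch (dB)."""
    edges = np.linspace(lo, hi, levels + 1)
    q = (np.digitize(patch_db, edges) - 1).clip(0, levels - 1).astype(np.uint8)
    glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    p = glcm.mean(axis=(2, 3))                   # average over the offsets
    # SAVG = sum_k k * p_{x+y}(k), with p_{x+y}(k) = sum_{i+j=k} p(i, j)
    # (0-based gray levels; Haralick's 1-based definition shifts k by 2)
    savg = 0.0
    for k in range(2 * levels - 1):
        i = np.arange(max(0, k - levels + 1), min(k, levels - 1) + 1)
        savg += k * p[i, k - i].sum()
    return savg

rng = np.random.default_rng(0)
forest = rng.normal(-7.0, 0.3, (25, 25))         # homogeneous stable forest
disturbed = rng.normal(-9.0, 1.5, (25, 25))      # heterogeneous tree remnants
print(glcm_sum_average(forest), glcm_sum_average(disturbed))
```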

How to cite: Balling, J., Herold, M., and Reiche, J.: The benefit of textural features for SAR-based tropical forest disturbance mapping, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3434, https://doi.org/10.5194/egusphere-egu23-3434, 2023.

EGU23-5175 | ECS | Orals | ESSI1.7

A comparative study of SMAP and ASCAT satellite soil moisture products with cosmic-ray neutron sensing and in-situ data in a Mediterranean setting 

Spyridon E. Detsikas, George P. Petropoulos, Nikos Koutsias, Dionisios Gasparatos, Vasilis Pisinaras, Heye Bogena, Frank Wendland, Frank Herrmann, and Andreas Panagopoulos

Obtaining Soil Moisture Content (SMC) over large scales is of key importance in several environmental and agricultural applications, especially in the context of climate change and the transition to digital farming. Remote sensing (RS) has a demonstrated capability in retrieving SMC over large areas, with several operational products already available at different spatiotemporal resolutions. At the same time, cosmic-ray neutron sensing is a recently emerged approach for retrieving high temporal resolution SMC at intermediate spatial scales. The present study conducts an intercomparison between different RS-based soil moisture products, daily SMC retrievals from a cosmic-ray neutron sensor (CRNS) station, and a network of in situ SoilNet wireless sensors installed at the Pinios Hydrologic Observatory ILTER site in central Greece, for the period 2018-2019. The RS-based soil moisture products included herein are from NASA's Soil Moisture Active Passive (SMAP) and the Metop-A/B Advanced Scatterometer (ASCAT) satellite missions. The methodological workflow adopts standardized validation procedures, employing a series of statistical measures to quantify the agreement between the RS-based soil moisture products, the CRNS-based SMC and the SoilNet ground truth data. Our study results contribute towards global efforts aiming at exploiting CRNS data in the context of soil moisture retrievals and their potential synergies with RS-based products. Furthermore, our findings provide valuable insights into the capability of CRNS to retrieve more accurate SMC estimates in arid and semi-arid environments, such as those found in the Mediterranean basin, while also supporting ongoing global validation efforts.
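
The statistical measures typically used in such soil moisture validation exercises (bias, RMSE, unbiased RMSE, and correlation; the abstract does not list its exact set) can be sketched as follows:

```python
import numpy as np
from scipy.stats import pearsonr

def sm_validation_metrics(sat, ref):
    """Common soil-moisture validation scores (units: m3/m3)."""
    bias = np.mean(sat - ref)
    rmse = np.sqrt(np.mean((sat - ref) ** 2))
    ubrmse = np.sqrt(rmse ** 2 - bias ** 2)     # unbiased RMSE
    r, _ = pearsonr(sat, ref)
    return {"bias": bias, "RMSE": rmse, "ubRMSE": ubrmse, "R": r}

rng = np.random.default_rng(0)
# Hypothetical daily reference SMC and a noisy, biased satellite product
ref = 0.25 + 0.08 * np.sin(np.linspace(0, 6 * np.pi, 365))
sat = ref + rng.normal(0.02, 0.04, ref.size)
print(sm_validation_metrics(sat, ref))
```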

Keywords: Cosmic Ray Neutron Sensors; SMAP; ASCAT; SoilNet; Soil Moisture Content

How to cite: Detsikas, S. E., Petropoulos, G. P., Koutsias, N., Gasparatos, D., Pisinaras, V., Bogena, H., Wendland, F., Herrmann, F., and Panagopoulos, A.: A comparative study of SMAP and ASCAT satellite soil moisture products with cosmic-ray neutron sensing and in-situ data in a Mediterranean setting, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5175, https://doi.org/10.5194/egusphere-egu23-5175, 2023.

The upper range limit of trees is one of the most conspicuous boundaries on Earth. However, publicly available forest extent and forest cover datasets systematically underestimate sparse tree cover, which hinders our understanding of the distribution of the tree limit and its drivers over cold and arid regions. Here, we built a three-step upscaling strategy that integrates in situ measured vegetation types with spaceborne Light Detection and Ranging (LiDAR), microwave, and Landsat images in a Convolutional Neural Network (CNN) classification algorithm, to develop a new map of the upper range limit of trees over the Three-River-Source National Park circa 2020 at 30 m resolution. The new multi-satellite product incorporates vertical structure information, which allows it to better detect sparse trees and to better distinguish between shrub, grass, and forest. Validation shows that our result is highly consistent with manual interpretations from Google Earth high-resolution images (R2 = 0.97, slope = 0.99, ME = 18 m). Our proposed method provides a fast and effective tree limit mapping solution at the global scale.

How to cite: Xu, J., Wang, X., Lv, G., and Wang, T.: High-resolution map of the upper range limit of trees over the cold and arid region, a case study in the Three-River-Source National Park, Tibetan Plateau, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6800, https://doi.org/10.5194/egusphere-egu23-6800, 2023.

EGU23-8451 | Orals | ESSI1.7

Enhanced and gap-free Sentinel-2 reflectance data at vast scales with GEE 

Emma Izquierdo-Verdiguier, Álvaro Moreno-Martínez, Jordi Muñoz-Mari, Nicolas Clinton, Francesco Vuolo, Clement Atzberger, and Gustau Camps-Valls

The presence of clouds and aerosols in satellite imagery hampers its use for monitoring, observing and analyzing the Earth's surface. Multisensor fusion can alleviate this problem. The HISTARFM algorithm developed by Moreno-Martinez et al. (2020) can generate monthly gap-filled reflectance data at 30 m spatial resolution by blending Landsat (30 m pixel size every 16 days) and MODIS (500 m pixel size daily) data using a bias-aware Kalman filter.
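
At its core, the fusion step is a per-pixel Kalman update that weights each sensor by its uncertainty. The sketch below shows only that core gain/update step under illustrative numbers; the actual HISTARFM filter additionally estimates and removes inter-sensor bias.

    def kalman_update(x_prior, p_prior, z, r):
        # Blend a coarse-sensor prediction (x_prior, variance p_prior)
        # with a fine-sensor observation (z, variance r).
        k = p_prior / (p_prior + r)           # Kalman gain
        x_post = x_prior + k * (z - x_prior)  # fused reflectance
        p_post = (1.0 - k) * p_prior          # fused uncertainty
        return x_post, p_post

    # e.g., coarse-sensor prediction 0.28 (std 0.03), fine observation 0.22 (std 0.01)
    x, p = kalman_update(0.28, 0.03**2, 0.22, 0.01**2)   # -> x = 0.226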

Cloud computing platforms such as Google Earth Engine (GEE) help us to efficiently process public data archives from different remote sensing data sources. GEE therefore allows us to adapt the HISTARFM algorithm to obtain gap-filled data at higher spatial resolution. To reduce the massive number of images involved in the process, the bias-aware Kalman filter blends the available, preprocessed HISTARFM monthly gap-filled reflectance (30 m pixel size every month) with Sentinel-2 (10 m pixel size every five days) data. The resulting gap-filled images at very high spatial resolution provide reflectance information at scales suitable for new products that improve decision-making in varied territories with complex topographies. In addition, new derivative products (e.g. land cover maps, biophysical parameters, or phenological indicators) will give the scientific community a better understanding and monitoring of the bio-geographical and ecoclimatic characteristics of the Earth.

Additionally, the temporal resolution of the series can be refined with this approach by linear interpolation, producing gap-filled Sentinel-2 reflectance data every five days. The proposed approach shows promising preliminary results and provides gap-free Sentinel-2 reflectance images with their associated uncertainties. These results foster the development of improved near-real-time applications for crop and natural vegetation monitoring at continental scales.

How to cite: Izquierdo-Verdiguier, E., Moreno-Martínez, Á., Muñoz-Mari, J., Clinton, N., Vuolo, F., Atzberger, C., and Camps-Valls, G.: Enhanced and gap-free Sentinel-2 reflectance data at vast scales with GEE, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8451, https://doi.org/10.5194/egusphere-egu23-8451, 2023.

EGU23-8825 | ECS | Orals | ESSI1.7

PASSION: a workflow for the estimation of rooftop photovoltaic potential from satellite imagery. 

Rodrigo Pueblas, Jann Weinand, Patrick Kuckertz, and Detlef Stolten

Photovoltaic (PV) and wind are currently the fastest-growing renewable energies according to the annual World Energy Outlook 2022. However, in the case of solar PV in Europe, this growth is mainly driven by utility-scale installations. Distributed residential generation has many benefits, such as relieving the electrical grid and increasing self-sufficiency. A key challenge is to accurately estimate the rooftop PV potential of different regions, in order to best allocate economic resources and regulate accordingly. Multiple approaches have been proposed in the past, such as inferring potential from proxy variables like population density, automatically analyzing residential 3D point clouds, or automatically analyzing satellite images. The latter has gained popularity in recent years given the increased availability of satellite imagery and the improvement of computer vision methods. However, in research, the analysis of satellite imagery is impeded by the lack of transparency, reproducibility, and standardization of methods. Studies are heterogeneous, target different types of potential with redundant efforts, and are mostly not open source or rely on private datasets for training. This makes it challenging for users of various backgrounds to find and use the existing approaches.

For these reasons, this paper proposes a conceptual framework that describes and categorizes the tasks that need to be considered when estimating PV potential, thus creating a clear structure along which the contents of this research can be classified. Additionally, the open-source workflow PASSION is introduced, which integrates the assessment of the geographical, technical and economic potentials of the regions under consideration along with the calculation of surface areas, orientations and slopes of individual rooftop sections. It also includes the detection of obstacles and existing PV installations. It is based on a novel two-look approach, in which three independent models are deployed in parallel for the identification of rooftops, sections and superstructures. The three models show a mean Intersection over Union (IoU) between classes of 0.847, 0.753 and 0.462, respectively, and, more importantly, show consistent results on non-selected real-life samples.
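
For reference, the mean IoU reported for each model can be computed per class and averaged, as in this minimal sketch (function and argument names are illustrative):

    import numpy as np

    def mean_iou(pred, target, n_classes):
        ious = []
        for c in range(n_classes):
            inter = np.logical_and(pred == c, target == c).sum()
            union = np.logical_or(pred == c, target == c).sum()
            if union > 0:                 # skip classes absent from both maps
                ious.append(inter / union)
        return float(np.mean(ious))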

How to cite: Pueblas, R., Weinand, J., Kuckertz, P., and Stolten, D.: PASSION: a workflow for the estimation of rooftop photovoltaic potential from satellite imagery., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8825, https://doi.org/10.5194/egusphere-egu23-8825, 2023.

EGU23-8916 | ECS | Orals | ESSI1.7

Separating tree systems in agricultural lands from forests using Deep learning 

Wanting Yang, Daniel Ortiz Gonzalo, Xiaoye Tong, Dimitri Pierre Johannes Gominski, Martin Brandt, Ankit Kariryaa, Florian Reiner, and Rasmus Fensholt

Distinguishing trees on agricultural land from forests is essential for a better understanding of the relationship between forests and human farming activities. However, it is difficult to separate them with remote sensing imagery since they share similar canopy cover, especially on the edge of the Amazon rainforest, which has a highly complicated agricultural pattern. Besides annual crops and pasture, there are also many agroforestry applications and shifting cultivation practices that integrate tree systems, and those tree systems are not well separated from forest in existing land cover maps. Recent techniques allow for the mapping of single trees outside of forests; we now take the next step by identifying these diverse tree-involved systems on agricultural land. Here we aim to develop a robust, cost-efficient method to distinguish trees within agricultural land from the forest. We started our exploration in the Peruvian Amazon, where competition for land has increased in the last decades, with possible adverse effects on livelihoods and ecosystem services. Deep learning models, data sampling, and fine-tuning strategies are tested and optimized with PlanetScope satellite imagery. Our research target is to provide a tool for separating tree systems in farmland from the forest. It can also be used as a base map to explore the dynamics of agricultural transition and its impact on livelihoods and ecosystem services.

How to cite: Yang, W., Ortiz Gonzalo, D., Tong, X., Pierre Johannes Gominski, D., Brandt, M., Kariryaa, A., Reiner, F., and Fensholt, R.: Separating tree systems in agricultural lands from forests using Deep learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8916, https://doi.org/10.5194/egusphere-egu23-8916, 2023.

EGU23-13610 | ECS | Orals | ESSI1.7

Unsupervised Segmentation Of Microwave Brightness Temperatures To Study The Changes In The Water Cycle 

Vibolroth Sambath, Nicolas Viltard, Laurent Barthès, Audrey Martini, and Cécile Mallet

Due to climate change, understanding the changes in the water cycle has become a pressing issue. It is increasingly important to study prolonged periods of intense precipitation or dry spells to better manage water supply, infrastructure and agriculture. However, obtaining fine-scale precipitation data is challenging due to the intermittent nature of rain in time and space. Ground-based instruments can have mismatches between different regions due to spatial distribution, calibration, and complex topography. On the other hand, space-borne observations have uncertainties in their retrieval algorithms. This study proposes to work directly with microwave images from space remote sensing, as this type of data makes it possible to study the evolution of the atmospheric water cycle on a global scale and with a temporal coverage of several decades while avoiding the uncertainties of retrieval methods. In recent years, convolutional neural networks have shown promising capabilities in identifying cyclones and weather fronts in large labelled climate datasets. However, these models require large labelled datasets for training and testing. The present study aims to test unsupervised approaches that segment microwave images into different classes. Instead of focusing on only one aspect, for example precipitation, the obtained classes capture many physical properties, because microwave brightness temperatures contain essential information on the atmospheric water cycle that can be used to derive many products such as rain intensity, water vapour, cloud fraction, and sea surface temperature. The unsupervised segmentation model consists of blocks of fully convolutional networks serving as feature extractors. Without labels, pseudo-targets from the feature extractors are used to train the model. The performance of the model in terms of intra-class and inter-class distances is compared with that of simpler models such as k-means. A major challenge in the unsupervised approach is validating and interpreting the resulting classes. Most of the obtained cluster patterns provide geographically coherent regions whose modes of variability in geophysical quantities can be highlighted. The presented study will then explore how the different classes computed by the unsupervised methods can be labelled and how the properties of the said classes change through time and space.
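
A k-means baseline with an intra-/inter-class distance score can be set up in a few lines; the file name, channel layout and cluster count below are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import davies_bouldin_score

    # Hypothetical input: brightness temperatures reshaped to (n_pixels, n_channels).
    X = np.load("tb_pixels.npy")
    labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
    # Davies-Bouldin relates intra-class scatter to inter-class separation (lower is better).
    print(davies_bouldin_score(X, labels))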

How to cite: Sambath, V., Viltard, N., Barthès, L., Martini, A., and Mallet, C.: Unsupervised Segmentation Of Microwave Brightness Temperatures To Study The Changes In The Water Cycle, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13610, https://doi.org/10.5194/egusphere-egu23-13610, 2023.

EGU23-14137 | ECS | Posters virtual | ESSI1.7

The Global Distribution and Trajectory of Aquaculture Ponds 

Yang Xu and Lian Feng

The development of global aquaculture ponds provides valuable socio-economic benefits in the Anthropocene epoch, but also causes potential environmental and ecological impacts. However, the extent and trajectory of aquaculture ponds over the past 37 years remain unknown on a global scale. Our study maps the global distribution of aquaculture ponds over 9 periods (1984-1994, 1995-2000, and every 3 years from 2001 to 2021) based on a deep-learning method and Landsat observations. The total area of global aquaculture ponds expanded from 10043.3 km2 to 18779.7 km2, with a slowing growth rate. Fishpond area in Asia accounts for up to 82% of the global total. The extent of aquaculture ponds in Asia and South America has doubled since 1984. China, Vietnam, and Indonesia - the three countries with the largest fishpond areas - exhibited their largest fishpond area in 2004-2006. Our study provides a critical basis for assessing the spatial-temporal trajectory and potential influences of aquaculture ponds.

How to cite: Xu, Y. and Feng, L.: The Global Distribution and Trajectory of Aquaculture Ponds, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14137, https://doi.org/10.5194/egusphere-egu23-14137, 2023.

EGU23-14375 | ECS | Orals | ESSI1.7

Self-Supervised Contrastive Model for Flood Mapping and Monitoring on SAR Time-Series 

Ritu Yadav, Andrea Nascetti, Hossein Azizpour, and Yifang Ban

Flooding is a natural disaster that has been increasing in recent years due to climate and land-use changes. Earth observations, such as Synthetic Aperture Radar (SAR) data, are valuable for assessing and mitigating the negative impacts of flooding. Cloud cover is highly correlated with flooding events, making SAR a preferable choice over optical data for flood mapping and monitoring.

Traditional methods for flood mapping and monitoring using SAR data, such as Otsu thresholding and change vector analysis (CVA), can be affected by noise, false detections due to shadows and occlusions, and geometric distortions. While automatic thresholding can be effective with these methods, manual adjustment of the threshold is often required to produce an accurate change map.

Supervised deep learning methods using large amounts of labeled data could potentially improve the accuracy of flood mapping and monitoring. We have a large amount of Earth observation data, but the availability of labeled data is limited, and labeling data is time-consuming and requires domain expertise. On the other hand, supervised model training on small datasets causes severe generalizability issues when inference is performed on a new site.

To address these challenges, we propose a novel self-supervised method for mapping and monitoring floods on Sentinel-1 SAR time-series data. We propose a probabilistic model trained on unlabeled data using self-supervised techniques such as reconstruction and contrastive learning. The model is trained to learn the spatiotemporal features of the area. It monitors changes by comparing the latent feature distribution at each time stamp and generates change maps that reflect the changes in the area.
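
The abstract does not spell out the contrastive objective; a common formulation for paired embeddings is the InfoNCE loss, sketched below, where corresponding rows of the two embedding batches are positives and all other rows are negatives.

    import torch
    import torch.nn.functional as F

    def info_nce(z1, z2, tau=0.1):
        # z1, z2: (n, d) embeddings of two views of the same n samples.
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        logits = z1 @ z2.t() / tau                       # scaled cosine similarities
        targets = torch.arange(z1.size(0), device=z1.device)
        return F.cross_entropy(logits, targets)          # match each row to its pair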

We also propose a framework for flood monitoring that continuously monitors the area using time-series data. This framework automatically detects the change point, i.e., when a major change first becomes visible in the available SAR data. Our continuous monitoring framework, combined with better temporal resolution (better than Sentinel-1), could potentially detect flood events at an early stage, allowing more time for evacuation planning.

The model is evaluated on nine recent flood events from the 'Mekong', 'Somalia', 'Scotland', 'Australia', 'Bosnia', 'Germany', 'Spain', 'Bolivia', and 'Slovakia' sites. We compared our results with traditional methods and with existing supervised and unsupervised methods. Our detailed evaluation indicates that our model is more accurate and generalizes better to new sites. The model achieves an average Intersection over Union (IoU) of 70% and an F1 score of 81.14%, both higher than the scores of the previous best-performing method. Overall, our proposed model's improvements range from 7-26% in terms of F1 and 8-31% in terms of IoU.

How to cite: Yadav, R., Nascetti, A., Azizpour, H., and Ban, Y.: Self-Supervised Contrastive Model for Flood Mapping and Monitoring on SAR Time-Series, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14375, https://doi.org/10.5194/egusphere-egu23-14375, 2023.

EGU23-15020 | ECS | Orals | ESSI1.7

Transfer Learning for LULC Classification on multi-modal data in the Amazon Basin 

Maximilian Hell, Melanie Brandmeier, and Andreas Nüchter

Mapping land use and land cover (LULC) changes over time requires automated processes and has been investigated using various machine learning algorithms and, more recently, deep learning models for semantic classification. New applications of these models to different satellite data and areas are regularly published. However, studies on the transfer of these models to other data and study areas are rather scarce. In a previous study [1], we used multi-modal and -temporal Sentinel data for LULC classification using traditional and novel deep learning models. The data covered parts of the Amazon basin and comprised a twelve-month time series of radar imagery (Sentinel-1) combined with a single multi-spectral image (Sentinel-2). All satellite images were captured throughout the year 2018. The label map (Collection 4) of the Amazon produced by the MapBiomas project [2] was used for training and test labels. Besides state-of-the-art models, we developed five variations of a deep learning model - DeepForest - which leverages the multi-temporal and -modal aspects of the data. The best model variation (DF1c) reached an overall accuracy of 74.4% on the test data.

Currently, we are investigating the transferability of these models to more recent data of the same region. The new dataset was processed in the same way as in the previous study. It comprises a Sentinel-1 time series and a single Sentinel-2 image from 2020, with an updated version of the MapBiomas label map (Collection 6). This posed some challenges, as the classification scheme changed and is not fully backwards compatible with the one used to train the DeepForest models. A test dataset was chosen in the state of Mato Grosso, as the satellite scenes cover most classes used in the classification scheme. However, this data exhibits considerable class imbalance, as two of the eleven classes dominate the scene. All five DeepForest variations reached accuracies higher than 79% and thus generalize well on the major LULC classes. For comparison, and to further improve our models, we are currently retraining them on the new, larger dataset (114,376 training image tiles compared to 18,074). Preliminary results will be shown during the session.
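
One common way to counter such class imbalance during retraining is inverse-frequency class weighting in the loss; this is a general technique, not necessarily the authors' approach, and the pixel counts below are hypothetical placeholders.

    import torch
    import torch.nn as nn

    # Hypothetical pixel counts for the eleven classes (two dominate the scene).
    counts = torch.tensor([9.1e7, 5.5e7, 8.0e5, 6.2e5, 4.0e5, 3.1e5,
                           2.5e5, 2.0e5, 1.5e5, 1.0e5, 8.0e4])
    weights = counts.sum() / (len(counts) * counts)   # inverse-frequency weights
    criterion = nn.CrossEntropyLoss(weight=weights)   # down-weights dominant classes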


References

  • [1] Cherif, E.; Hell, M.; Brandmeier, M. DeepForest: Novel Deep Learning Models for Land Use and Land Cover Classification Using Multi-Temporal and -Modal Sentinel Data of the Amazon Basin. Remote Sensing 2022, 14, 5000, doi:10.3390/rs14195000.
  • [2] MapBiomas Brasil. Available online: https://mapbiomas.org/en

How to cite: Hell, M., Brandmeier, M., and Nüchter, A.: Transfer Learning for LULC Classification on multi-modal data in the Amazon Basin, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15020, https://doi.org/10.5194/egusphere-egu23-15020, 2023.

Riparian ecosystems are biodiversity hotspots and provide crucial services to human wellbeing. Currently, knowledge of how riparian ecosystems respond to, and in turn influence, variations in the environment remains considerably limited. As a first step toward filling this gap, this research aims to characterize the dynamics of riparian vegetation during the past several decades across multiple aquatic sites operated by the National Ecological Observatory Network (NEON) of the US. Specifically, it leverages high-resolution hyperspectral and lidar data collected by NEON's Airborne Observation Platform (AOP) surveys, the long-term records of satellite optical and radar imagery, and advanced data fusion and classification techniques to generate a time-series record of riparian vegetation on a seasonal-to-yearly basis. The maps derived will provide a new basis for understanding how riparian vegetation has changed across the continental US, and for predicting how it is likely to change in the future. This work is sponsored by NSF's Macrosystems Biology and NEON-Enabled Science (MSB-NES) Program (2021/9-2024/8), and the overarching goal of the project is to mechanistically link riparian vegetation dynamics to hydroclimate variations and assess the functional importance of riparian ecosystems to macrosystem fluxes of carbon and water.

How to cite: Jin, H. and Tai, X.: Spatiotemporal mapping of riparian vegetation through multi-sensor data fusion and deep learning techniques, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17425, https://doi.org/10.5194/egusphere-egu23-17425, 2023.

EGU23-888 | ECS | PICO | ESSI1.8

Generating self-labeled geological datasets for semantic segmentation using pretrained GANs 

Ivan Ferreira and Ardiansyah Koeshidayatullah

Recent advancements in deep generative models (GANs) have drawn the attention of many researchers to exploring the feasibility of using realistic synthetic data as (i) a digital twin of the original dataset and (ii) a new approach to augmenting the original dataset. Previous works highlighted that GANs can replicate both the aesthetic and statistical characteristics of datasets, to the point of being indistinguishable from real samples, even when examined by domain experts. In addition, the weights learned during the unsupervised training of these generative models are useful for further extracting specific features of interest from the given dataset. In geosciences, many computer vision tasks are related to semantic segmentation, from pore quantification to fossil characterization. In such tasks, the labeling process becomes the main limiting factor, being both time-consuming and requiring domain experts. Hence, in this study, we repurpose GANs to obtain self-labeled geological datasets for semantic segmentation that are readily applicable in geological machine learning workflows. In this work, we used trained style-based GANs of foraminifera specimens, ooids, and mudstones. Our experiments show that with one or a few labels, we can successfully generate self-labeled, synthetic datasets featuring the labels of interest. This achievement is pivotal in geosciences for exploring GANs for one-shot and few-shot segmentation and for minimizing the manual labeling effort that segmentation requires from domain experts.

How to cite: Ferreira, I. and Koeshidayatullah, A.: Generating self-labeled geological datasets for semantic segmentation using pretrained GANs, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-888, https://doi.org/10.5194/egusphere-egu23-888, 2023.

EGU23-2203 | PICO | ESSI1.8

Object detection and classification applying AI (computer vision) to underwater images 

Young-Tae Son, Sang-Yeop Jin, and Tae-Soon Kang

The visual AI (artificial intelligence) algorithm YOLOv5 was used to detect marine organisms in underwater images, and the test results showed a high average detection rate (>90%). As performance indicators of the AI model, both precision and recall exceeded 0.95, showing very good performance. To minimize variation in object detection performance under changing underwater conditions, image correction was conducted, and more objects could be detected after correction.

To determine which species an object detected in a video or image corresponds to, performance was evaluated with an AI classification model (YOLO-Classification), a deep learning algorithm; accuracy improved by approximately 3% after image correction. We sought to identify the taxonomic species of organisms using deep learning and, although the number of target species was small, achieved a classification accuracy of about 80% or more based on the data collected so far.

High-quality image database data for the target species have to be established from a long-term perspective in order to accurately classify object (fish) species, and images taken from various angles of the target species must be collected to further improve performance.

As a prerequisite for measuring the size of an object detected in an image, MDE (Monocular Depth Estimation), a deep learning approach for estimating depth from a monocular camera image, was applied, and the distance from a given reference point was calculated with the MiDaS v3 algorithm. In tests of MiDaS v3, the excessive error was reduced compared to before its application, and a distance measurement accuracy of up to 2 m, longer than the guide stick length, was obtained.
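
Both model families used here are published on torch.hub, so a minimal detection-plus-depth pipeline can be sketched as below; the frame path is a placeholder, and MiDaS returns relative inverse depth, which still needs calibration against a known reference as described above.

    import cv2
    import torch

    yolo = torch.hub.load("ultralytics/yolov5", "yolov5s")
    detections = yolo("frame.jpg")        # boxes, confidences, class ids
    detections.print()

    midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
    midas.eval()
    transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform
    img = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        inv_depth = midas(transform(img)) # relative inverse depth map, not metric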

How to cite: Son, Y.-T., Jin, S.-Y., and Kang, T.-S.: Object detection and classification applying AI (computer vision) to underwater images, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2203, https://doi.org/10.5194/egusphere-egu23-2203, 2023.

EGU23-3254 | ECS | PICO | ESSI1.8

Construction of Interactive Websites for Remote Sensing Datasets 

Kai Norman Clasen and Begüm Demir

As a result of advancements in satellite technology, archives of remote sensing (RS) images are continuously growing, providing a valuable source of information for monitoring the Earth's surface. Researchers construct well-designed and ready-to-use datasets from the plethora of RS images for the broader community to make it easier to develop and compare novel algorithms, models, and architectures to further deepen our understanding of our planet from space. However, the descriptions of these datasets are often published in scientific papers as PDF files with several limitations:

  • The target audience is typically domain experts familiar with scientific jargon;
  • The work is required to adhere to a specific page limit;
  • Once the document is published, it is difficult to update sections or to centralize discussions around it. 

To overcome these issues, here we introduce the concept of interactive dataset websites that aim at making the dataset and research based on it more accessible. With visual and interactive examples, users can see exactly how the data is structured and how it can be used in different contexts. For example, when working with RS data, it is beneficial to get a quick overview of the geographical distribution. By providing more in-depth background information about data sources and product specifications, these websites can also help users understand the context in which the data was collected, how it might be relevant to their work, and how to avoid common pitfalls. Another important aspect of interactive dataset websites is the inclusion of example code for using, loading, and visualizing the data. Especially when working with RS images (e.g., multispectral, hyperspectral, or synthetic aperture radar data), it is often not trivial to visualize the data. Providing example code can be especially useful for researchers unfamiliar with the specific tools required to work with the data, or to introduce to the community tools specifically written to make it easier to work with the dataset. Quick feedback can be vital, as it allows researchers to report problems or ask questions that the authors or community can address in an open and centralized manner. Creating these "living, ever-evolving documents" makes them an increasingly valuable resource for anyone working with the dataset, leading to more robust and reliable research.
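
As an illustration of the kind of example code such a website might embed, the sketch below renders an RGB composite from three single-band GeoTIFFs; the file names are placeholders, not the actual dataset layout.

    import numpy as np
    import rasterio
    from matplotlib import pyplot as plt

    bands = []
    for path in ("patch_B04.tif", "patch_B03.tif", "patch_B02.tif"):  # R, G, B
        with rasterio.open(path) as src:
            bands.append(src.read(1).astype(float))
    rgb = np.stack(bands, axis=-1)
    rgb = np.clip(rgb / np.quantile(rgb, 0.98), 0, 1)  # simple contrast stretch
    plt.imshow(rgb)
    plt.axis("off")
    plt.show()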

It might seem daunting at first to create such an interactive dataset website, but due to recent open-source projects such as Executable Books (https://executablebooks.org/) and free hosting providers such as GitHub Pages (https://pages.github.com/), it has become relatively easy to produce and host such websites. The HTML content can be generated from Jupyter Notebooks, a tool that many researchers and data scientists are familiar with. To provide an example, in our talk we will showcase an interactive dataset website for the BigEarthNet-MM dataset, which you can find here: https://docs.kai-tub.tech/ben-docs/

How to cite: Clasen, K. N. and Demir, B.: Construction of Interactive Websites for Remote Sensing Datasets, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3254, https://doi.org/10.5194/egusphere-egu23-3254, 2023.

EGU23-3493 | PICO | ESSI1.8

OGC Testbed-18 Machine Learning Training Datasets Task: Application of standards to Machine Learning training datasets 

Samantha Lavender, Caitlin Adams, Ivana Ivánová, and Kate Williams

Training datasets are a crucial component of any machine learning approach, with significant human effort spent creating and curating these for specific applications. However, a historical absence of standards has resulted in inconsistent and heterogeneous training datasets with limited discoverability and interoperability. Therefore, there is a need for best practices and guidelines for generating, structuring, describing, and curating training datasets.

The Open Geospatial Consortium (OGC) Testbed-18 initiative covered several topics related to geospatial data, focussing on issues around cataloguing and interoperability. Within Testbed-18, the Machine Learning Training Datasets task aimed to develop a foundation for future standardization of training datasets for Earth observation applications.

For this task, members from Pixalytics, FrontierSI, and Curtin University authored an Engineering Report that reviewed:

  • Examples of how training datasets have been used in Earth observation applications
  • The current best-practice methods for documenting training datasets
  • The various requirements for training dataset metadata
  • How the Findability, Accessibility, Interoperability, and Reuse (FAIR) principles apply to training datasets

The Engineering Report provides a foundation that OGC can leverage in creating the future standard for machine learning training data for Earth observation applications. The Engineering Report also provides a useful overview of the state of work and key considerations for anyone wishing to improve how they document their training datasets.

In our presentation, we discuss the key findings from the Engineering Report, including key metadata identified from Earth observation use cases, the current state of the art, thoughts on cataloguing and describing training data quality, and how the FAIR principles apply to training data. 

How to cite: Lavender, S., Adams, C., Ivánová, I., and Williams, K.: OGC Testbed-18 Machine Learning Training Datasets Task: Application of standards to Machine Learning training datasets, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3493, https://doi.org/10.5194/egusphere-egu23-3493, 2023.

EGU23-5394 | ECS | PICO | ESSI1.8

AwesomeGeodataTable - Towards a community-maintained searchable table for data sets easily usable as predictors for spatial machine learning 

Maximilian Nölscher, Anne-Karin Cooke, Sandra Willkommen, Mariana Gomez, and Stefan Broda

In the field of spatial machine learning, access to high-quality data sets is a crucial factor in the success of any analysis or modeling project, especially in subsurface hydrology. However, finding and utilizing such data sets can be a challenging and time-consuming process. This is where AwesomeGeodataTable comes in. AwesomeGeodataTable aims to establish a community-maintained searchable table of data sets that are easily usable as predictors for spatial machine learning, starting with a focus on subsurface hydrology. With its user-friendly interface and currently small but growing number of data sets, AwesomeGeodataTable will make it easier for researchers and practitioners to find and use the data they need for their work. It brings the usability of existing data set collections to the next level by adding features for filtering and searching meta-information on data sets. This talk will introduce attendees to the AwesomeGeodataTable project, its goals and features, and how they can get involved in maintaining and extending its database and in expanding its features and user experience. Overall, AwesomeGeodataTable is a valuable resource for anyone working in the field of spatial machine learning, and we hope to see it become a widely used and respected resource in the community.

How to cite: Nölscher, M., Cooke, A.-K., Willkommen, S., Gomez, M., and Broda, S.: AwesomeGeodataTable - Towards a community-maintained searchable table for data sets easily usable as predictors for spatial machine learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5394, https://doi.org/10.5194/egusphere-egu23-5394, 2023.

EGU23-7299 | PICO | ESSI1.8

Towards generation of synthetic hyperspectral image datasets with GAN 

François De Vieilleville, Adrien Lagrange, Nicolas Dublé, and Bertrand Le Saux

In the context of the CORTEX project, a study was carried out to build a method to generate synthetic images with associated labels for hyperspectral use cases. Such a method is interesting in cases where too few annotated data are available to train a deep neural network (DNN). Hyperspectral imaging is particularly suited to this problem since labeled datasets of hyperspectral images are scarce and generally very small.

Therefore, the first step of the project was to define an interesting hyperspectral use case for the study. More concretely, generative models must be trained to achieve this objective, which means a set of hyperspectral images and their associated ground truth are necessary to train the models. A dataset was created from PRISMA images associated with the IGN BD Forêt v2. The result is a segmentation dataset of 1268 images of size 256x256 pixels with 234 spectral bands. The associated ground truth includes 4 classes: not-forest, broad-leaved forest, coniferous forest and mixed forest. To correctly match the ground truth and the images, substantial work was done to improve the geolocation of the PRISMA images by coregistering patches with Sentinel-2 images. We want to underline the value of this database, which remains, to our knowledge, one of the few large-scale hyperspectral databases, and which is made available on the Zenodo platform.

Then, a segmentation model was trained on the dataset to assess its quality and the feasibility of forest-type segmentation. Good results were obtained using a UNet-EfficientNet segmentation DNN. This showed that the dataset is coherent, but the problem remains difficult, since the 'mixed forest' class is challenging to identify.

Finally, an important research effort was conducted to develop a Generative Adversarial Network method able to generate synthetic hyperspectral images. The state-of-the-art StyleGAN2 was modified for this purpose: an additional discriminator was added and tasked with discriminating between synthetic and real images in a reduced image space. Good results were obtained for the generation of 32-band images, but the results worsen as the number of bands increases. The difficulty of the problem appears directly linked to the number of bands to be generated.

The final goal was to generate synthetic ground truth masks alongside the images, and the SemanticGAN method was selected to address this problem. Since this method is based on StyleGAN2, the improvements of StyleGAN2 for hyperspectral images were included in the method. In the end, a modified version of SemanticGAN was proposed: the discriminator assessing the coherence between masks and images was modified to use an image of reduced dimension, and a specific training strategy was introduced to help convergence. The initial expectation was that the generation of masks would help stabilize the generation of images, but the experiments showed the contrary. Early results are promising, but more research will be necessary to obtain pairs of images and masks that could be used to train a DNN.

How to cite: De Vieilleville, F., Lagrange, A., Dublé, N., and Le Saux, B.: Towards generation of synthetic hyperspectral image datasets with GAN, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7299, https://doi.org/10.5194/egusphere-egu23-7299, 2023.

EGU23-12352 | ECS | PICO | ESSI1.8

Point-Cloud Class Separability: Identifying the Most Discriminative Features 

Max Hess, Aljoscha Rheinwalt, and Bodo Bookhagen

The global availability of dense point-clouds provides the potential to better assess changes in our dynamic world, particularly environmental changes and natural hazards. A core step in making use of modern point-clouds is to have a reliable classification and to identify the features of importance for a successful classification. However, the quality of classification is affected by both the classifier and the complexity of the features which describe the classes. To address the limitations of classification performance, we attempt to answer the question: to what extent can a classifier learn the separation into different classes based on the available features in a given training dataset?

We compare several measures of class separability to assess the descriptive value of each feature. A ranked list is generated that includes all individual features as well as all possible combinations within specific groups. Selecting high-ranked features based on their descriptive value allows us to summarize datasets without losing essential information about the individual classes. This is an important step in processing existing training data or in setting priorities for future data collection.
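
A simple instance of such a separability measure is the two-class Fisher ratio, sketched below for ranking individual features; the study compares several measures, of which this is only one illustrative example.

    import numpy as np

    def fisher_ratio(x, y):
        # Squared gap between class means over the summed class variances.
        a, b = x[y == 0], x[y == 1]
        return (a.mean() - b.mean()) ** 2 / (a.var() + b.var() + 1e-12)

    def rank_features(X, y):
        # X: (n_points, n_features); y: 0 = ground, 1 = vegetation.
        scores = np.array([fisher_ratio(X[:, k], y) for k in range(X.shape[1])])
        return np.argsort(scores)[::-1]   # most discriminative features first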

In our application experiments, we compare geometric and echo-based features of lidar point-clouds to obtain the most useful sets of features for separating ground and vegetation points into their respective classes. Different scenarios of suburban and natural areas are studied to collect insights for different classification tasks. In addition, we group features based on attributes such as acquisition or computational cost and evaluate the benefits of these efforts in terms of possible improvements in the classification result.

How to cite: Hess, M., Rheinwalt, A., and Bookhagen, B.: Point-Cloud Class Separability: Identifying the Most Discriminative Features, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12352, https://doi.org/10.5194/egusphere-egu23-12352, 2023.

EGU23-14061 | PICO | ESSI1.8

The Earth Observation Training Data Lab (EOTDL) - addressing training data related needs in the Earth Observation community. 

Patrick Griffiths, Juan Pedro, Gunnar Brandt, Stephan Meissl, Grega Milcinski, and Laura Moreno

The availability of large training datasets (TDS) has enabled much of the innovative use of Machine Learning (ML) and Artificial Intelligence (AI) in fields such as computer vision or language processing. In Earth Observation (EO) and geospatial science and applications, the availability of TDS has generally been limited, and there are a number of specific geospatial challenges to consider (e.g. spatial reference systems and spatial/spectral/radiometric/temporal resolutions). Creating TDS for EO applications commonly involves labor-intensive processes, and the willingness to share such datasets has been very limited. While the current open accessibility of EO datasets is unprecedented, the availability of training and ground truth datasets has not improved much over the last years, and this is limiting the potentially innovative impact that new ML/AI methodologies could have in the EO domain. Next to general availability and accessibility, further challenges need to be addressed in terms of making TDS interoperable and findable and lowering the barriers for non-geospatial experts.


In response to these challenges, ESA has initiated the development of the Earth Observation Training Data Lab (EOTDL). EOTDL is being developed on top of federated European cloud infrastructure and aims to address the EO community's requirements for working with TDS in EO workflows, adopting FAIR data principles and following open science best practices.

The specific capabilities that EOTDL will support include:

  • Repository and Curation: host, import and maintain training datasets, ground truth data, pretrained models and benchmarks, providing versioning, tracking and provenance.
  • Tooling: provide a set of integrated open-source tools compatible with the major ML/AI frameworks to create, analyze and optimize TDS and to support data ingestion, model training and inference operations.
  • Feature engineering: Link with the main EO data archives and EO analytics platforms to support feature engineering and large-scale inference.
  • Quality assurance: embed QA throughout the offered capabilities, also taking advantage of automated deterministic checks and defined levels of TDS maturity.

To achieve these goals, EOTDL is building on proven technologies, such as STAC (SpatioTemporal Asset Catalog) to support data cataloguing and discoverability, the openEO and Sentinel Hub APIs for EO data access and feature engineering, GeoDB for vector geometry and attribute handling, and EoxHub to support interactive tooling. The EOTDL functionality will be exposed via web-based GUIs, Python libraries and command line interfaces.
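
To make the cataloguing idea concrete, a training sample could be registered as a STAC item with image and label assets, along these lines; the identifiers, hrefs and asset layout are illustrative assumptions, not the EOTDL schema.

    from datetime import datetime
    import pystac

    item = pystac.Item(
        id="tds-sample-0001",
        geometry={"type": "Polygon", "coordinates": [[
            [13.3, 45.7], [13.5, 45.7], [13.5, 45.9], [13.3, 45.9], [13.3, 45.7]]]},
        bbox=[13.3, 45.7, 13.5, 45.9],
        datetime=datetime(2023, 6, 1),
        properties={},
    )
    item.add_asset("image", pystac.Asset(
        href="s3://bucket/sample-0001.tif", media_type=pystac.MediaType.COG))
    item.add_asset("labels", pystac.Asset(
        href="s3://bucket/sample-0001-labels.geojson",
        media_type=pystac.MediaType.GEOJSON))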

A central objective is also the incentivization of community engagement to support quality assurance and encourage the contribution of datasets; for this, award mechanisms are being established. The initial data population consists of around 100 datasets, while intuitive data ingestion pipelines allow for continuous community contributions. Three defined product maturity levels are linked to QA procedures and support the trustworthiness of the data population. The development is coordinated with Radiant MLHub to seek synergies rather than duplicate the offered capabilities.

This presentation will showcase the current development status of EOTDL and discuss in detail some key aspects such as the data curation with STAC and the adopted quality assurance and feature engineering approaches. A set of use cases that establish new TDS creation tools and result in large scale datasets are presented as well.

How to cite: Griffiths, P., Pedro, J., Brandt, G., Meissl, S., Milcinski, G., and Moreno, L.: The Earth Observation Training Data Lab (EOTDL) - addressing training data related needs in the Earth Observation community., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14061, https://doi.org/10.5194/egusphere-egu23-14061, 2023.

EGU23-16998 | PICO | ESSI1.8

The OGC Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) Standard 

Peng Yue, Boyi Shangguan, and Danielle Ziebelin

The development of Artificial Intelligence (AI), and especially Machine Learning (ML) technology, has injected new vitality into the geospatial domain. Training Data (TD) play a fundamental role in geospatial AI/ML: they are key items for training, validating, and testing AI/ML models. At present, open-access Training Datasets (TDS) are usually packaged in public or personal file repositories without a standardized method for expressing their metadata and data content, making them difficult to find, access, interoperate with, and reuse.

Therefore, building on the Open Geospatial Consortium (OGC) standards baseline, the OGC Training Data Markup Language for AI (TrainingDML-AI) Standard Working Group (SWG) set out to develop a TD model and encoding methods to exchange and retrieve TD in the Web environment. The scope includes: how TD are prepared, how to specify the different metadata used for different AI/ML tasks, and how to differentiate the high-level TD information model from extended information models specific to various AI/ML applications. This contribution describes the latest progress and status of the standard's development.

The TrainingDML-AI conceptual model includes the most relevant entities of the TD, covering everything from the dataset level down to individual training samples and labels. It specifies how the TD should be decomposed into parts and classified. The core concepts include: AI_TrainingDataset, which represents a collection of training samples; AI_TrainingData, which is an individual training sample in a TDS; AI_Task, which identifies what task the TDS is used for; AI_Label, which represents the label semantics for TD; AI_Labeling, which provides the provenance for the TD; AI_TDChangeset, which records TD changes between two TDS versions; and DataQuality, which can be associated with the TDS to document its quality.

The TrainingDML-AI content model focuses on implementations, with basic attributes defined for off-the-shelf deployment. Concepts related to EO AI/ML applications are defined as additional elements. Six key components are highlighted:

  • Training Dataset/Data. AI_AbstractTrainingDataset indicates the TDS, while each training sample is represented as AI_AbstractTrainingData. AI_EOTrainingDataset and AI_EOTrainingData are defined to convey attributes specific to the EO domain.
  • AI_EOTask is proposed by extending AI_AbstractTask to represent specific AI/ML tasks in the EO domain. The task type can refer to a particular type defined by an external category.
  • Labels for each individual training sample can be represented using features, coverages, or semantic classes. The AI_AbstractLabel is extended to specify AI_SceneLabel, AI_ObjectLabel, and AI_PixelLabel respectively.
  • AI_Labeling records basic provenance information on how the TDS was created. It includes the labeler and labeling procedure, which can be mapped to the agent and activity, respectively, in W3C PROV.
  • DataQuality and QualityElements defined in ISO 19157-1 are used to align with existing efforts on geographic data quality.
  • Change procedures of the TDS are documented in the AI_TDChangeset, which is composed of changed training samples at the collection level.

Finally, use case scenarios and best practices are provided to illustrate the intended use and benefits of TrainingDML-AI for EO AI/ML applications. In total, five different tasks are covered: scene classification, object detection, semantic segmentation, change detection, and 3D model reconstruction. Some software implementations, including pyTDML and LuojiaSet, are also presented.

How to cite: Yue, P., Shangguan, B., and Ziebelin, D.: The OGC Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) Standard, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16998, https://doi.org/10.5194/egusphere-egu23-16998, 2023.

EGU23-17570 | PICO | ESSI1.8

A dataset of Earth Observation Data for Lithological Mapping using Machine Learning 

Ioannis Vernikos, Georgios Giannopoulos, Aikaterini Christopoulou, Anxhelo Begaj, Marianthi Stefouli, Emmanuel Bratsolis, and Eleni Charou

Machine Learning (ML) algorithms have successfully contributed to the creation of automated methods for recognizing patterns in high-dimensional data. Remote sensing data cover wide geographical areas and could be used to reduce the demand for various in-situ data. Lithological mapping using remotely sensed data is one of the most challenging applications of ML algorithms. In the framework of the "AI for Geoapplications" project, ML and especially Deep Learning (DL) methodologies are investigated for the identification and characterization of lithology based on remote sensing data in various pilot areas in Greece. In order to train and test the various ML algorithms, a dataset consisting of 30 ROIs, selected mainly from low-vegetated areas and covering 2% of the total area of Greece, was created. For each ROI, the following are provided:

  • the corresponding shapefile with the lithological units
  • the corresponding Sentinel-2 (10 bands) and/or ASTER (14 bands) images

The dataset is publicly available in the cloud, along with the necessary code for visualization and processing.

How to cite: Vernikos, I., Giannopoulos, G., Christopoulou, A., Begaj, A., Stefouli, M., Bratsolis, E., and Charou, E.: A dataset of Earth Observation Data for Lithological Mapping using Machine Learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17570, https://doi.org/10.5194/egusphere-egu23-17570, 2023.

EGU23-3441 | Posters on site | ESSI1.9

A Programming Model for Geospatial Machine-Learning with Scalability in Hybrid Multiclouds 

Michiaki Tatsubori, Daiki Kimura, Takao Moriyama, Naomi Simumba, and Tatsuya Ishikawa

While deep machine learning approaches are becoming pervasive in remote sensing and in modeling the Earth, difficulties due to the size of satellite data are a constant pain for scientists implementing such experimental software. We present a programming model for geospatial machine learning based on TorchGeo and PyTorch, which are becoming the de facto standards for programming in Python. TorchGeo is open-sourced and designed to make it simple for remote sensing experts to explore machine learning solutions. Our objective is to allow machine-learning programs using TorchGeo to scale by leveraging proprietary high-performance computing (HPC) and multicloud HPC resources from one's notebook. One of the key technologies specifically needed in geospatial machine learning is the smart integration of peta-scale data services and data-distributed parallel frameworks. We implement such a platform as a part of the IBM Research Geospatial Discovery Network (GDN) and experiment with segmentation tasks such as flood detection from satellite data to show its scalability.
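
For orientation, a typical TorchGeo entry point looks like the sketch below, which samples random tiles from a directory of Sentinel-2 scenes; the directory path and sampler settings are placeholders.

    from torch.utils.data import DataLoader
    from torchgeo.datasets import Sentinel2, stack_samples
    from torchgeo.samplers import RandomGeoSampler

    dataset = Sentinel2("data/s2")                 # placeholder directory of scenes
    sampler = RandomGeoSampler(dataset, size=256, length=1000)
    loader = DataLoader(dataset, sampler=sampler, collate_fn=stack_samples)
    for batch in loader:
        images = batch["image"]                    # (batch, bands, 256, 256) tensors
        break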

How to cite: Tatsubori, M., Kimura, D., Moriyama, T., Simumba, N., and Ishikawa, T.: A Programming Model for Geospatial Machine-Learning with Scalability in Hybrid Multiclouds, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3441, https://doi.org/10.5194/egusphere-egu23-3441, 2023.

EGU23-3501 | ECS | Posters on site | ESSI1.9

EarthNets: An Open Deep Learning Platform for Earth Observation 

Zhitong Xiong and Xiao Xiang Zhu

Earth observation (EO) data are critical for monitoring the state of planet Earth and can be helpful for various real-world applications [1]. Although numerous benchmark datasets have been released, there is no unified platform for developing and fairly comparing deep learning models on EO data [2]. For deep learning methods, the backbone networks, hyper-parameters, and training details are influential factors when comparing performance. However, existing works usually neglect these details and even evaluate performance with different training/validation/test dataset splits. This makes it difficult to compare different algorithms fairly and reliably. In this study, we introduce the EarthNets platform, an open deep-learning platform for remote sensing and Earth observation. The platform is based on PyTorch [3] and TorchData. There are about ten different libraries, covering different tasks in remote sensing. Among them, Dataset4EO is designed as a standard, easy-to-use data-loading library, which can be used alone or together with other high-level libraries like RSI-Classification (for image classification), RSI-Detection (for object detection), and RSI-Segmentation (for semantic segmentation). Two factors were considered in the design of the EarthNets platform. The first is the decoupling of dataset loading from high-level EO tasks: as there are more than 400 RS datasets with different data modalities, research domains, and download links, efficient preparation of analysis-ready data can greatly accelerate research for the whole community. The other factor is bringing advances in machine learning to EO by providing new deep-learning models. The EarthNets platform provides a fair and consistent evaluation of deep learning methods on remote sensing and Earth observation data [4]. It also helps bring together the remote sensing community and the larger machine-learning community. The platform and dataset collections are publicly available at https://earthnets.github.io.

[1] Zhu, Xiao Xiang, et al. "Deep learning in remote sensing: A comprehensive review and list of resources." IEEE Geoscience and Remote Sensing Magazine 5.4 (2017): 8-36.

[2] Long, Yang, et al. "On creating benchmark dataset for aerial image interpretation: Reviews, guidances, and million-aid." IEEE Journal of selected topics in applied earth observations and remote sensing 14 (2021): 4205-4230.

[3] Paszke, Adam, et al. "Pytorch: An imperative style, high-performance deep learning library." Advances in neural information processing systems 32 (2019).

[4] Xiong, Zhitong, et al. "EarthNets: Empowering AI in Earth observation." arXiv preprint arXiv:2210.04936 (2022).

How to cite: Xiong, Z. and Zhu, X. X.: EarthNets: An Open Deep Learning Platform for Earth Observation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3501, https://doi.org/10.5194/egusphere-egu23-3501, 2023.

EGU23-4160 | Posters on site | ESSI1.9

Sentinel Hub - federated on-demand ARD generation 

Grega Milcinski and Primoz Kolaric

Every experiment starts with data, which need to be fine-tuned for the specific use case. We call this "analysis ready data (ARD)". In some cases, for the sake of reusability and comparability, the specifications for ARD are well defined. In many other cases, however, the procedures are not yet mature enough to support standardisation. In the Earth Observation (EO) field this is especially true, as the whole community is moving from (semi)manually analysing individual scenes, dating from the time when barely any data were available, to processing of time series, now that Landsat and Sentinel have made this possible. We now even face a problem where there is simply too much data, with PBs of open and commercial imagery readily available. With the data being distributed across different places (Copernicus Data Access Service for Sentinel, AWS for Landsat), the challenge is further magnified. A machine learning (ML) approach can address the challenge of sifting through data, but ML also requires data to be pre-processed for the purpose and made available at the place where the ML is running. Therefore, it is essential to have a facility which can generate ARD customised for the specific requirements of an analysis.

Sentinel Hub (SH) is a satellite imagery processing service capable of on-the-fly gridding, re-projection, re-scaling, mosaicking, compositing, orthorectification and other actions required either for integration in web applications, where mostly pictures are served, or in ML and similar analysis processes, where pixel values and statistics are essential. SH works with original satellite data files and does not require replication or pre-processing. It uses cloud infrastructure and innovative methods to efficiently process and distribute data in a matter of seconds. Sentinel Hub gives access to a rich collection of satellite data, including the full set of Sentinel satellites, Landsat collections, commercial VHR collections and other complementary collections. It also provides the ability for users to onboard their own data in one of the standardised formats. Furthermore, data located on different clouds can be fused together in a single process, benefiting from the variability and volume of different sensors.

There are two main capabilities which make SH especially fit for the purpose of generating on-demand ARD. The first is support for user-provided processing scripts, which are recipes for what should happen with the sensor data (band composites, indices, even simple neural networks combining the available data). The second is a set of processing orchestration options. There is a Process API for immediate access to the pixel values. The Statistical API is optimised for time-series analysis: it aggregates the data over a specific area of interest and provides configurable statistics through time. And then there are asynchronous siblings of these services, fine-tuned for large-scale processing - if one wants to prepare ML features for an entire continent or get time series for millions of agricultural parcels.
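
With the sentinelhub Python package, a Process API request pairs such a user-provided script with an area, time range and output format; the NDVI evalscript, bounding box and dates below are illustrative, and credentials are assumed to be configured in SHConfig.

    from sentinelhub import (BBox, CRS, DataCollection, MimeType,
                             SentinelHubRequest, SHConfig)

    # The user-provided recipe, executed server-side (a simple NDVI here).
    evalscript = """
    //VERSION=3
    function setup() {
      return {input: ["B04", "B08"], output: {bands: 1, sampleType: "FLOAT32"}};
    }
    function evaluatePixel(s) {
      return [(s.B08 - s.B04) / (s.B08 + s.B04)];
    }
    """

    request = SentinelHubRequest(
        evalscript=evalscript,
        input_data=[SentinelHubRequest.input_data(
            data_collection=DataCollection.SENTINEL2_L2A,
            time_interval=("2023-06-01", "2023-06-30"))],
        responses=[SentinelHubRequest.output_response("default", MimeType.TIFF)],
        bbox=BBox((13.35, 45.75, 13.45, 45.85), crs=CRS.WGS84),
        size=(512, 512),
        config=SHConfig(),
    )
    ndvi = request.get_data()[0]    # numpy array for the requested window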

We will present the technology behind the scenes that makes the processing possible, as well as several use-cases showing how one can efficiently make use of the service in ML.

How to cite: Milcinski, G. and Kolaric, P.: Sentinel Hub - federated on-demand ARD generation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4160, https://doi.org/10.5194/egusphere-egu23-4160, 2023.

EGU23-7233 | Posters on site | ESSI1.9

A flexible, scalable, cloud-native framework for geospatial modelling 

Blair Edwards, Paolo Fraccaro, Nikola Stoyanov, Anne Jones, Junaid Butt, Julian Kuehnert, Andrew Taylor, and Bhargav Garikipati

Understanding and quantifying the risk of the physical impacts of climate change, and their subsequent consequences, is of crucial importance in a changing climate for both businesses and society more widely. Historically, modelling workflows to assess such impacts have been bespoke and constrained by the data they can consume, the compute infrastructure, the expertise required to run them and the specific ways they are configured. Here we present a cloud-native modelling framework for running geospatial models in a flexible, scalable, configurable, user-friendly manner. This enables models (physical or ML/AI) to be rapidly onboarded and composed into workflows. These workflows can be flexible, dynamic and extendable, running for historical events or as forecast ensembles, with varying data inputs, or extended to model impact in the real world (e.g. on infrastructure and populations). The framework supports the streamlined training and deployment of AI models, which can be seamlessly integrated with physical models to create hybrid workflows. We demonstrate the application and features of the framework for the examples of flooding and wildfire.

How to cite: Edwards, B., Fraccaro, P., Stoyanov, N., Jones, A., Butt, J., Kuehnert, J., Taylor, A., and Garikipati, B.: A flexible, scalable, cloud-native framework for geospatial modelling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7233, https://doi.org/10.5194/egusphere-egu23-7233, 2023.

EGU23-11709 | Posters on site | ESSI1.9

Microservice architecture to enable fast assessment of the NatCat events based on EO and geospatial data 

Karolina Sarna, Johannes Hiekkasaari, and Joni Taajamo

Fast response to natural catastrophe events is crucial in our fast-changing world. Creating comprehensible solutions based on Earth Observation (EO) and geospatial data is complex: it requires combining multiple data sources and maintaining a large set of configuration parameters.

In this talk we discuss the application of a microservices architecture to tackle some of the issues inherent in building products based on EO and geospatial data. We will present how decomposing sophisticated algorithms into small services can help with the continuous delivery, scaling and deployment of large, complex applications that can be reused across products. This architecture enables reproducibility of analysis, which is a crucial component for applying machine learning and automation to any EO-based product. We will also address the additional complexity of creating a distributed system, as well as the high dependency on data consistency and availability.
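
To make the decomposition concrete (an illustrative sketch rather than the actual system; the service name, route and fields are invented), a single responsibility, such as computing a vegetation statistic over an area of interest, can be exposed as a small, independently deployable service:

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI(title="ndvi-stats")  # hypothetical microservice

    class AOIRequest(BaseModel):
        polygon: dict  # GeoJSON-like geometry; fields are illustrative
        start: str
        end: str

    @app.post("/mean-ndvi")
    def mean_ndvi(req: AOIRequest) -> dict:
        # A real service would query an EO data backend here; a stub keeps
        # the example self-contained.
        return {"mean_ndvi": 0.42, "start": req.start, "end": req.end}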

How to cite: Sarna, K., Hiekkasaari, J., and Taajamo, J.: Microservice architecture to enable fast assessment of the NatCat events based on EO and geospatial data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11709, https://doi.org/10.5194/egusphere-egu23-11709, 2023.

EGU23-13481 | Posters on site | ESSI1.9

Satellite data as a predictor for monitoring tree health and stress in reforestation projects 

H. Gijs J. Van den Dool and Deepali Bidwai

In many parts of the world, reforestation is an ongoing activity, but due to deforestation processes (e.g. changes in soil conditions, agricultural expansion, and infrastructure expansion such as urbanisation or road building), the success rate of replanting is far from certain; it is therefore essential to:

  • have a good idea of the pre-planting conditions at the location,
  • monitor the growth,
  • improve the growing conditions whenever possible, and
  • adapt the site selection criteria

In the proposed method, it is not possible to change the site selection of already planted locations, but it is possible to monitor the selected locations and check under which conditions the trees grow best.

Several data sources are identified to predict plant health and stress, first to establish a baseline and then to project from this baseline into the future (short and mid-term). We compute the main vegetation index (NDVI) from the high-resolution image data provided by Planet (through the NICFI Basemaps for Tropical Forest Monitoring program). The historical NDVI values are obtained from Sentinel-2 (and potentially Landsat) data at lower resolutions. Environmental conditions are added to the stress index by extracting the relevant meteorological parameters from the ERA5 database (temperature and precipitation) to compute drought indices (e.g. KBDI/SPI/SPEI) and water availability (AWC) with the dominant soil type, supplemented with supporting indices from the satellite data (e.g. NDWI/SAVI/EVI-2).
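
The index computations themselves are simple band arithmetic; for instance, NDVI from near-infrared and red reflectances (a generic sketch, independent of the data provider):

    import numpy as np

    def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
        """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
        denom = nir + red
        return np.where(denom == 0, 0.0, (nir - red) / denom)

    # The supporting indices follow the same pattern, e.g. NDWI from green and NIR.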

For reforestation projects it is vital to monitor the impact of environmental parameters on plant health and stress. To assist with the forest maintenance of the sites, we built time-series models for temperature, precipitation and various vegetation indices to create a baseline for site-specific growing conditions. Deep learning (DL) models, such as semantic segmentation based on a Convolutional Neural Network (CNN), can be built on top of this using transfer learning to extract features from models pre-trained on large (global) datasets. The model can not only predict tree health but can also be used to predict growing conditions in the near future by flagging potential dry periods before they happen.

The high-resolution remotely sensed products are available in the (sub)tropical zone [30N-30S], while the lower-resolution products and the ERA5 data have global coverage. The test sites in this study are example sites, but the developed method can be applied to any reforestation monitoring project. The result of the analysis is a near-term growth indicator, which can be used to adjust the growing conditions of a site, as well as to assist with site selection for new reforestation projects (based on the established baseline and predictions).

The next step, after validation, is to create a dashboard where the user can select any location (within the data domain) and construct the baseline and prediction based on the available information.

How to cite: Van den Dool, H. G. J. and Bidwai, D.: Satellite data as a predictor for monitoring tree health and stress in reforestation projects, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13481, https://doi.org/10.5194/egusphere-egu23-13481, 2023.

EGU23-13852 | Posters on site | ESSI1.9

Can machine learning help us to create improved and trustworthy satellite-based precipitation products? 

Ioannis Tsoukalas, Panagiotis Kossieris, Luca Brocca, Silvia Barbetta, Hamidreza Mosaffa, and Christos Makropoulos

A key variable of earth observation (EO) systems is precipitation, as indicated by the wide spectrum of applications in which it is involved (e.g., water resources and early-warning systems for flood/drought events). During the last decade, the EO community has put significant research effort into the development of satellite-based precipitation products (SPPs); however, their deployment in real-world applications has not yet reached its full potential, despite their ever-growing availability, spatiotemporal coverage and resolution. This may be associated with the reluctance of end-users to employ SPPs, either worrying about the uncertainty and biases inherent in SPPs or because of the existence of multiple SPPs whose performance fluctuates across the globe, making it difficult to select the most appropriate SPP (some sort of choice paradox). To address this issue, this work targets the development of an explainable machine learning approach capable of integrating multiple satellite-based precipitation (P) and soil moisture (SM) products into a single precipitation product. Hence, in principle, it creates a new dataset that optimally combines the properties of each individual satellite dataset (used as predictors), better matching the ground-based observations (used as predictand, i.e., the reference dataset). The proposed approach is showcased via a benchmark dataset consisting of 1009 cells/locations around the world (Europe, USA, Australia and India), highlighting its robustness as well as an applicability that is independent of specific climatic regimes and local peculiarities.
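
As a hedged illustration of the merging step (the abstract does not name the learner; the products and data below are synthetic placeholders), one could regress gauge precipitation on several satellite predictors with a gradient-boosted model and inspect feature importances for explainability:

    import numpy as np
    from sklearn.ensemble import HistGradientBoostingRegressor
    from sklearn.inspection import permutation_importance

    rng = np.random.default_rng(0)
    X = rng.random((1000, 4))  # columns, e.g.: SPP 1, SPP 2, SPP 3, soil moisture
    y = X @ np.array([0.5, 0.2, 0.2, 0.1]) + 0.05 * rng.standard_normal(1000)

    model = HistGradientBoostingRegressor().fit(X, y)
    imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
    print(imp.importances_mean)  # which product drives the merged estimate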

How to cite: Tsoukalas, I., Kossieris, P., Brocca, L., Barbetta, S., Mosaffa, H., and Makropoulos, C.: Can machine learning help us to create improved and trustworthy satellite-based precipitation products?, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13852, https://doi.org/10.5194/egusphere-egu23-13852, 2023.

EGU23-14353 | Posters on site | ESSI1.9

MLOps in practice: how to scale your geospatial practice with cloud-based shared MLOps platform 

Frank de Morsier and Julien Rebetez

The key to driving innovation in EO science and applications, boosting geospatial mass adoption and in turn 'geo-enabling' companies, researchers and institutions, is moving away from complex, inefficient and expensive workflows and making fundamental changes in ML practices. This is where geospatial MLOps, and platforms such as Picterra, play a crucial role: cloud-native, shared platforms offer user-friendly and efficient interfaces and a smart toolkit, paired with auto-scaling infrastructure and state-of-the-art deep learning architectures. They make it possible to create and operate geospatial ML models at scale, enabling organizations to complete geospatial ML projects faster than ever before.

MLOps platforms systematize the process of building and training experimental machine learning models, as well as translating them into production. This workflow efficiency empowers teams working with massive datasets and allows organizations to leverage data analytics for decision-making and for building better customer experiences.

Achieving productivity and speed requires streamlining and automating processes, as well as building reusable assets that can be managed closely for quality and risk. When significant model drift is detected, the ability to retrain and redeploy ML models in an automated fashion is crucial to ensure business continuity.

A shared platform, managed infrastructure and an integrable architecture result in streamlined pipelines and straightforward integration. This agility reduces the time to value and frees up time to serve more use cases, leading to increased value to the business. Companies implementing geospatial MLOps can speed up model training times, dramatically improve accuracy, and go from an idea to a live solution in just days, without increasing headcount or technical debt. Over time, they will also collect a library of strategic ML assets that will enable them to act on timely data, fast.

Using Picterra as a prime example of a geospatial ML platform built with MLOps processes at its core, we will dive into how it facilitates the key steps of ML workflows, including:

  • Direct access to a diverse range of satellite imagery sources via the platform, i.e. Sentinel-1/2, PlanetScope, open aerial imagery campaigns, and ingestion of WMS/XYZ server streams.
  • Compatibility with any geospatial imagery source (e.g. optical, SAR, hyperspectral, thermal infrared, etc.), with the possibility to connect to cloud data storage or to upload directly via the web interface, in addition to the image servers mentioned above.
  • An unequalled MLOps interface to prototype the extraction of new information from imagery around any custom-defined use case, i.e. biodiversity monitoring, crop mapping and classification, asset management and many more. Trained models are directly served and made available for inference at large scale.
  • An extensive toolset for explainable and interpretable AI, for example dataset exploration, bringing robustness and efficiency to the creation of geospatial machine learning models.
  • Fast turnaround time in creating and validating machine learning models, saving time and resources thanks to the auto-scaling infrastructure leveraging Kubernetes and an intuitive interface for fast prototyping.
  • A unique set of advanced GIS pre-/post-processing tools to manage imagery and the extracted geospatial outputs.
  • A complete API and Python library to further integrate with existing workflows or software (e.g. ESRI ArcGIS, Safe FME, etc.), as sketched below.
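
As a minimal illustration of such an API integration from Python (the endpoint path, payload fields and authentication header below are assumptions for the sketch, not Picterra's documented API):

    import os
    import requests

    BASE = "https://app.picterra.ch/public/api"  # assumed base URL
    HEADERS = {"X-Api-Key": os.environ["PICTERRA_API_KEY"]}  # assumed auth header

    # Hypothetical call: run a trained detector on a previously uploaded raster.
    resp = requests.post(
        f"{BASE}/detectors/my-detector-id/run",
        headers=HEADERS,
        json={"raster_id": "my-raster-id"},
        timeout=60,
    )
    resp.raise_for_status()
    print(resp.json())  # e.g. an operation to poll until GeoJSON results are ready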

How to cite: de Morsier, F. and Rebetez, J.: MLOps in practice: how to scale your geospatial practice with cloud-based shared MLOps platform, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14353, https://doi.org/10.5194/egusphere-egu23-14353, 2023.

EGU23-16802 | Posters on site | ESSI1.9

eo-grow - Earth Observation framework for scaled-up processing in Python 

Matej Batič, Žiga Lukšič, and Grega Milcinski

Analysing EO data is a complex process, and solutions often require custom-tailored algorithms. On top of that, most problems in the EO domain come with an additional challenge: how can the solution be applied at a large scale?

Within the H2020 project Global Earth Monitor (GEM), we have updated and extended eo-learn with additional functionalities that allow new approaches to scalable and cost-effective Earth Observation data processing. We have tied it to Sentinel Hub's unified main data interface (Process API), to the Data Cube processing engine for constructing analysis-ready adjustable data cubes using the Batch Process API, and, finally, to the Statistical API and Batch Statistical API to streamline access to spatio-temporally aggregated satellite data.

As part of the GEM processing framework, we have built eo-grow, which facilitates the extraction of valuable information from satellite imagery. eo-grow tackles the issue of scalability by coordinating clusters that run EO workflows over large areas using Ray. At the same time, the framework provides reproducibility and traceability of experiments through schema-based input configurations and their validation.

In eo-grow, a workflow-based solution is wrapped into a pipeline object, which takes care of parametrization, logging, storage, multi-processing, data management and more. The pipeline object is configured via a well-defined schema, allowing straightforward experimentation and scaling up: moving to a larger area of interest, running on a different time interval, or tweaking any other pipeline parameter becomes just a matter of updating the (JSON) configuration, which additionally serves as a record of the experiment.
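
As a hedged illustration of such a configuration (the pipeline name and keys below are invented for the example rather than taken from the eo-grow schema):

    import json

    # Hypothetical pipeline configuration; field names are illustrative only.
    config = {
        "pipeline": "DownloadAndIndexPipeline",
        "area": {"name": "slovenia", "buffer": 0.1},
        "time_interval": ["2022-01-01", "2022-12-31"],
        "workers": 32,
        "storage": {"project_folder": "s3://my-bucket/eo-grow-demo"},
    }

    with open("config.json", "w") as f:
        json.dump(config, f, indent=2)

    # The pipeline would then be launched from the command line, e.g.
    # `eogrow config.json` (command name assumed from the repository docs).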

The eo-grow library has been publicly released on GitHub: https://github.com/sentinel-hub/eo-grow. The documentation available in the repository provides an overview of the general structure of eo-grow and its core objects, together with instructions on installation and on using eo-grow via the command line interface. An additional repository, https://github.com/sentinel-hub/eo-grow-examples, showcases eo-grow on a few use-cases.

In the presentation we will introduce the framework and showcase its usability on concrete examples. We will illustrate how eo-grow is used in large-scale research experiments, explain its role in reproducibility, and show how the no-code approach and code reuse facilitate the productionalization of workflows.

How to cite: Batič, M., Lukšič, Ž., and Milcinski, G.: eo-grow - Earth Observation framework for scaled-up processing in Python, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16802, https://doi.org/10.5194/egusphere-egu23-16802, 2023.

EGU23-17046 | ECS | Posters on site | ESSI1.9

Wayang AgoraEO Plugin: The Framework for Scalable EO Workflows 

Rodrigo Pardo Meza, Jorge-Arnulfo Quiané-Ruiz, Begüm Demir, and Volker Markl

Currently, Earth Observation (EO) platforms provide datasets, algorithms, and processing capabilities. Nevertheless, each platform offers its own exclusive habitat in which to discover, process, and run EO elements. We recently proposed AgoraEO [2], a decentralized, open, and unified ecosystem where users can find EO elements, compose cross-platform EO pipelines, and execute them efficiently. With the ambition of supporting cross-platform federated analytics, AgoraEO relies on Apache Wayang [1] as its main analytical processing platform. Within AgoraEO, we are developing and enabling Apache Wayang with EO features, exposing the internals of BigEarthNet [2] to the Earth Observation community.

Here we present our Wayang AgoraEO plugin, which follows the BigEarthNet workflow to achieve all its benefits in a scalable and parameterizable (reusable) way. The Wayang AgoraEO plugin empowers users to create EO workflows using any EO platform in a simple way: with operators and an intuitive API that follows the behaviour of the EO platforms it exploits. The execution of sub-tasks is controlled but isolated in whichever data processing system is required, in tandem with the rest of the platform. In addition, one can fetch datasets from several independent sources. By design, Apache Wayang works as a declarative framework for ML: users specify ML tasks at a high level, using the most convenient API to write a workflow (Java/Scala, Python, and Postgres are supported). Wayang then models an ML task as a mathematical optimization problem and uses its gradient-descent-based optimizer to invoke the appropriate physical algorithms and system configurations to execute the task, thereby decoupling the user's specification of ML tasks from their execution.

We believe the Wayang AgoraEO plugin can be a game changer for the tedious task of implementing and deploying EO workflows within today's EO platforms: it makes it easy to reuse resources and share them. Likewise, it is an easily extensible solution in which new operators covering new EO platforms and tasks can be included. As a result, this solution can be a great leap in the democratization of EO technologies, contributing to their integration, scalability, and access to high-performance computing.

References

[1] S. Kruse, Z. Kaoudi, J.-A. Quiané-Ruiz, S. Chawla, F. Naumann and B. Contreras-Rojas, "Optimizing Cross-Platform Data Movement," IEEE 35th International Conference on Data Engineering, 2019, pp. 1642-1645.

[2] A. Wall, B. Deiseroth, E. Tzirita Zacharatou, J.-A. Quiané-Ruiz, B. Demir and V. Markl, "AGORA-EO: A Unified Ecosystem for Earth Observation - A Vision For Boosting EO Data Literacy," Big Data from Space Conference, 2021.

How to cite: Pardo Meza, R., Quiané-Ruiz, J.-A., Demir, B., and Markl, V.: Wayang AgoraEO Plugin: The Framework for Scalable EO Workflows, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17046, https://doi.org/10.5194/egusphere-egu23-17046, 2023.

An LSTM-based distributed hydrologic model was developed for an urban watershed in Korea. The inputs of the model are the time series of 10-minute radar-gauge composite rainfall and 10-minute temperature at the 239 model grid cells, and the output is the 10-minute flow discharge at the watershed outlet. The Nash-Sutcliffe Efficiency (NSE) coefficients for the calibration period (2013-2016) and validation period (2017-2019) were 0.99 and 0.67, respectively. Normal events were predicted better than extreme ones. Further in-depth analyses revealed that: (1) the model composes the watershed outlet flow discharge by linearly superimposing multiple time series, one created by each of the LSTM units; unlike in conventional hydrologic models, most of these time series fluctuate greatly in both the positive and negative domains; (2) the runoff-to-rainfall ratio of each model grid cell does not reflect the counterpart parameters of conceptual hydrologic models, revealing that the model simulates the watershed responses in a unique manner; (3) the model successfully reproduced the soil-moisture-dependent runoff processes, an essential prerequisite of continuous hydrologic models; (4) each of the LSTM units has a different temporal sensitivity to a unit rainfall stimulus, and the LSTM units that are sensitive to rainfall input have greater output weight factors near the watershed outlet, and vice versa. This means that the model learned a mechanism to separately consider hydrologic components with distinct response times, such as direct runoff and low-frequency baseflow.
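
For reference, the NSE reported above is the standard skill score comparing simulated and observed discharge; a minimal implementation (not tied to the study's code) is:

    import numpy as np

    def nse(sim: np.ndarray, obs: np.ndarray) -> float:
        """Nash-Sutcliffe Efficiency:
        1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
        return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)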

Acknowledgement

This research was supported by the Basic Science Research Program (Grant Number: 2021R1A2C2003471) and the Basic Research Laboratory Program (Grant Number: 2022R1A4A3032838) through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT.

How to cite: Kim, D. and Lee, Y.: Machines simulate hydrologic processes using a simple structure but in a unique manner – a case study of predicting fine scale watershed response on a distributed framework, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-245, https://doi.org/10.5194/egusphere-egu23-245, 2023.

This study developed a distributed hydrologic model based on long short-term memory (LSTM) to predict the flow discharge of the Joongrang stream, located in a highly urbanized area of Seoul, Korea. The model inputs are the time series of 10-minute radar-gauge composite precipitation data at 239 grid cells (1 km2) in the watershed and Normalized Difference Vegetation Index (NDVI) data derived from Landsat 8 images, and the model output is the 10-minute flow discharge at the watershed outlet. The model was trained for the calibration period 2013-2016 and validated for the period 2017-2019. The NSE values over the validation period for the optimal model architecture (256 LSTM hidden units) with and without NDVI input data were 0.68 and 0.52, respectively, which suggests that the machine can learn dynamic processes of soil infiltration and plant interception from the remotely sensed information provided by the satellite.

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2022R1A4A3032838).

How to cite: Lee, J. and Kim, D.: Effectiveness of Satellite-based Vegetation Index for Simulating Watershed Response Using an LSTM-based model in a Distributed Framework, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-339, https://doi.org/10.5194/egusphere-egu23-339, 2023.

EGU23-1218 | Posters on site | HS3.3

Exploring the Value of Natural Language Processing for Urban Water Research 

Ina Vertommen, Xin Tian, Tessa Pronk, Siddharth Seshan, Sotirios Paraskevopoulos, and Bas Wols

Natural Language Processing (NLP), empowered by the most recent developments in deep learning, has demonstrated its effectiveness for handling texts. Urban water research benefits from both subfields of NLP, namely Natural Language Understanding (NLU) and Natural Language Generation (NLG). In this work, we present three recent studies that use NLP for: (1) automated processing of and response to customer complaints registered with Dutch water utilities, (2) automated collection of up-to-date water-related information from the Internet, and (3) extraction of key information about chemical compounds and pathogen characteristics from scientific publications. These applications, using the latest NLP models and tools (e.g., Rasa, spaCy), take into account studies on both water quality and quantity for the water sector. According to our findings, NLU and rule-based text mining are effective in extracting information from unstructured texts. In addition, NLU and NLG can be integrated to build human-computer interfaces, such as a value-based chatbot that understands and addresses the demands made by customers of water utilities.
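
As a small illustration of the rule-based text-mining side (a sketch only; the pattern below is an invented example, not one of the studies' actual pipelines), spaCy's Matcher can pull concentration-style statements out of publication text:

    import spacy
    from spacy.matcher import Matcher

    nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed
    matcher = Matcher(nlp.vocab)

    # Match a number followed by "mg" "/" "L" (spaCy splits "mg/L" on the slash).
    matcher.add("CONCENTRATION", [[
        {"LIKE_NUM": True}, {"LOWER": "mg"}, {"ORTH": "/"}, {"LOWER": "l"},
    ]])

    doc = nlp("Chlorine residual was kept at 0.5 mg/L during the trial.")
    for _, start, end in matcher(doc):
        print(doc[start:end].text)  # -> "0.5 mg/L"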

How to cite: Vertommen, I., Tian, X., Pronk, T., Seshan, S., Paraskevopoulos, S., and Wols, B.: Exploring the Value of Natural Language Processing for Urban Water Research, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1218, https://doi.org/10.5194/egusphere-egu23-1218, 2023.

EGU23-1278 | ECS | Orals | HS3.3

Evaluating Machine Learning Approach for Regional Flood Frequency Analysis in Data-sparse Regions 

Nikunj K. Mangukiya and Ashutosh Sharma

Accurate flood frequency analysis is essential for developing effective flood management strategies and designing flood protection infrastructure, but it is challenging due to the complex, nonlinear hydrological system. In regional flood frequency analysis (RFFA), the flood quantiles at ungauged sites can be estimated by establishing a relationship between interdependent physio-meteorological variables and the observed flood quantiles at gauged sites in the region. However, this regional approach implies a loss of information due to the prior aggregation of hydrological data at gauged locations, and it can be difficult in data-sparse regions due to limited data. In this study, we evaluated an alternative approach to RFFA in two case studies: a data-sparse region in India and a data-dense region in the USA. In this approach, daily streamflow is predicted first using a deep learning-based hydrological model, and flood quantiles are then estimated from the predicted daily streamflow using statistical methods. We compared the results of this alternative approach to those of the traditional RFFA technique, which used the Random Forest (RF) and eXtreme Gradient Boosting (XGB) algorithms to model the nonlinear relationship between flood quantiles and relevant physio-meteorological predictor variables such as meteorological forcings, topography, land use, and soil properties. The results showed that the alternative approach produces more reliable results, with the lowest mean absolute error and the highest coefficient of determination, in the data-sparse region. In the data-dense region, both approaches produced comparable results. However, the alternative approach has the advantage of being flexible and of providing the complete time series of daily flow at the ungauged location, which can be used to estimate other flow characteristics, develop flow duration curves, or estimate flood quantiles of any return period without building a separate traditional RFFA model. This study shows that the alternative approach can provide accurate flood frequency estimates in data-sparse regions, offering a promising solution for flood management in these areas.
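
For a flavour of the traditional RFFA baseline described above (an illustrative sketch with synthetic stand-ins for the predictors and quantiles, not the study's data):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(42)
    X = rng.random((200, 5))  # e.g. area, mean rainfall, slope, land use, soil
    q100 = 10 * X[:, 0] + 5 * X[:, 1] ** 2 + rng.normal(0, 0.5, 200)  # synthetic 100-yr quantile

    rf = RandomForestRegressor(n_estimators=500, random_state=0)
    print(cross_val_score(rf, X, q100, scoring="r2", cv=5).mean())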

How to cite: Mangukiya, N. K. and Sharma, A.: Evaluating Machine Learning Approach for Regional Flood Frequency Analysis in Data-sparse Regions, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1278, https://doi.org/10.5194/egusphere-egu23-1278, 2023.

EGU23-1526 | ECS | Orals | HS3.3

Extrapo… what? Predictions beyond the support of the training data 

Ralf Loritz and Hoshin Gupta

Neural networks are among the best available methods for numerous hydrological modelling challenges. However, although they have been shown to outperform classical hydrological models in several applications, there is still some doubt as to whether neural networks, despite their excellent interpolation skills, are capable of making predictions beyond the support of the training data. This study addresses this issue and proposes an approach to infer the ability of a neural network to predict unusual, extreme system states. We show how the concepts of data surprise and model surprise can be used in a complementary manner to assess which unusual events a neural network can predict, which it can predict only with additional data, and which it cannot predict at all, the latter hinting at a wrong model choice or an incomplete description of the system by the data.

How to cite: Loritz, R. and Gupta, H.: Extrapo… what? Predictions beyond the support of the training data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1526, https://doi.org/10.5194/egusphere-egu23-1526, 2023.

A continuous and complete karst spring discharge record is necessary to understand the hydrological behaviour of a karst aquifer and to manage karst water resources. However, owing to problems such as equipment errors and observation failures, many hydrological research datasets contain missing spring discharge values, which becomes a main barrier for further environmental and hydrological modeling and studies. In this work, a novel approach that integrates deep learning algorithms with ensemble empirical mode decomposition (EEMD) is developed to reconstruct missing karst spring discharge values from local precipitation. EEMD is first employed to decompose the precipitation data, extract useful features, and remove noise. The decomposed precipitation components are then fed as input data to various deep learning models, including convolutional neural network (CNN), long short-term memory (LSTM), and hybrid CNN-LSTM models, to reconstruct the missing discharge values and compare performance. Root mean squared error (RMSE) and the Nash-Sutcliffe efficiency coefficient (NSE) are calculated as metrics to evaluate reconstruction performance. The models are validated with spring discharge and precipitation data collected at Barton Springs in Texas, and the reconstruction performance of the various deep learning models with and without EEMD is compared and evaluated. The main conclusions are: 1) by using EEMD, the integrated deep models significantly improve reconstruction performance and outperform the simple deep models; 2) among the three integrated models, the LSTM-EEMD model obtains the best reconstruction results; 3) for models with monthly data, reconstruction performance decreases greatly as the missing rate increases: the best results are obtained when the missing rate is low, and at a missing rate of 50% the reconstructions become notably poorer; 4) for models with daily data, reconstruction performance is less affected by the missing rate, and the models obtain satisfactory results for missing rates ranging from 10% to 50%.
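
As a sketch of the decomposition step (using the open-source PyEMD package; the signal here is synthetic):

    import numpy as np
    from PyEMD import EEMD  # pip install EMD-signal

    t = np.linspace(0, 1, 500)
    precip = np.sin(2 * np.pi * 5 * t) + 0.5 * np.random.default_rng(0).standard_normal(500)

    eemd = EEMD(trials=100)   # size of the noise-assisted ensemble
    imfs = eemd.eemd(precip)  # intrinsic mode functions, one per row
    print(imfs.shape)         # each IMF becomes one input channel for the deep model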

How to cite: Zhou, R. and Zhang, Y.: Reconstruct karst spring discharge data with hybrid deep learning models and ensemble empirical mode decomposition method, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2382, https://doi.org/10.5194/egusphere-egu23-2382, 2023.

Machine learning and deep learning have been proving their potential for streamflow modelling in various studies. In particular, long short-term memory (LSTM) models have shown exceptionally good results. However, machine learning models are often considered "black boxes" with limited interpretability. Explainable artificial intelligence (XAI) comprises methods that analyze the internal processes of a machine learning network and allow a glance into the "black box". Most proposed XAI techniques are designed for the analysis of images, and there is currently only limited work available on time series data.

In our study, we applied various XAI algorithms, including gradient-based methods (Saliency, InputXGradient, Integrated Gradients, GradientSHAP) as well as perturbation-based methods (Feature Ablation, Feature Permutation), to compare their applicability for reasonable interpretation in the hydrological context. To our knowledge, only Integrated Gradients has been applied to an LSTM in hydrology so far. Gradient-based methods analyze the gradient of the output with respect to the input features, whereas perturbation-based methods gain information by altering or masking specific input features. The different methods were applied to an LSTM trained for the lowland Ems catchment in Germany, where baseflow makes up the major share of total streamflow.
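
The method names above correspond to attribution classes in the Captum library for PyTorch; a minimal attribution call (with a stand-in model and invented input shape) looks like this:

    import torch
    from captum.attr import IntegratedGradients

    # Stand-in for the trained LSTM: one year of 3 forcing variables in, one flow value out.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(365 * 3, 1))
    x = torch.randn(1, 365, 3, requires_grad=True)

    ig = IntegratedGradients(model)
    attributions = ig.attribute(x)  # importance of each past day and feature
    print(attributions.shape)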

We analyzed the results with respect to their "timestep of influence", which describes the number of past days that are important for the prediction of streamflow on a particular day. All of the algorithms applied result in a comparable annual pattern, characterized by relatively small timesteps of influence in spring (wet season) and increasing timesteps of influence in summer and autumn (dry season). However, the range of absolute days of attribution varies between the methods. In conclusion, all methods produce reasonable results and appear to be suitable for interpretation purposes.

Furthermore, we compared the results to ERA5 reanalysis data and found evidence that the LSTM recognizes soil water storage as the main driver of streamflow generation in the catchment: we found an inverse seasonality between soil moisture and the timestep of influence.

How to cite: Ley, A., Bormann, H., and Casper, M.: Exploring different explainable artificial intelligence algorithms applied to a LSTM for streamflow modelling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3125, https://doi.org/10.5194/egusphere-egu23-3125, 2023.

EGU23-4137 | ECS | Orals | HS3.3

Sequential optimization of temperature measurements to estimate groundwater-surface water interactions 

Robin Thibaut, Ty Ferré, Eric Laloy, and Thomas Hermans

Groundwater-surface water (GW-SW) exchange fluxes are driven by a complex interplay of subsurface processes and their interactions with surface hydrology, which has a significant impact on water and contaminant exchanges. Due to the complexity of these systems, the accurate estimation of GW-SW fluxes is important for quantitative hydrological studies and should be based on relevant data and careful experimental design. The effective design of monitoring networks that can identify relevant subsurface information is therefore essential for the optimal protection of our water resources. In this study, we present novel deep learning (DL)-driven approaches for sequential and static Bayesian optimal experimental design (BOED) in the subsurface, with the goal of estimating GW-SW exchange fluxes from a set of temperature measurements. We apply probabilistic Bayesian neural networks (PBNN) to conditional density estimation (CDE) within a BOED framework, and the predictive performance of the PBNN-based CDE model is evaluated with a custom objective function based on the Kullback-Leibler divergence to determine optimal temperature sensor locations from the information gain provided by the measurements. This evaluation is used to determine the optimal sequential sampling strategy for estimating GW-SW exchange fluxes in the 1D case, and the results are compared to the static optimal sampling strategy for a 3D conceptual riverbed-aquifer model based on a real case study. Our results indicate that probabilistic DL is an effective method for estimating GW-SW fluxes from temperature data and for designing efficient monitoring networks. The proposed framework can be applied to other cases involving surface or subsurface monitoring and experimental design.

How to cite: Thibaut, R., Ferré, T., Laloy, E., and Hermans, T.: Sequential optimization of temperature measurements to estimate groundwater-surface water interactions, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4137, https://doi.org/10.5194/egusphere-egu23-4137, 2023.

Rainfall-runoff (RR) modeling remains a challenging task in the field of hydrology, especially when it comes to regional-scale hydrology. Recently, the long short-term memory (LSTM) network, known for its ability to learn sequential and temporal relations, has been widely adopted in RR modeling. Convolutional neural networks (CNN) have matured in computer vision tasks, and trials have been conducted to use them in hydrological applications. Different combinations of CNN and LSTM have been proven to work; however, questions remain about the suitability of different model architectures, the input variables needed by the model, and the interpretability of the models' learning process at the regional scale.

In this work we trained a sequential CNN-LSTM deep learning architecture to predict daily streamflow between 1980 and 2014, regionally and simultaneously, over 86 catchments from the CAMELS dataset in the US. The model was forced with a year-long, spatially distributed (gridded) input of precipitation, maximum temperature and minimum temperature for each day to predict one day of streamflow. The model takes advantage of the CNN to encode the spatial patterns in the input tensor and feeds them to the LSTM, which learns the temporal relations between them. The trained model was further fine-tuned to predict for 3 local sub-clusters of the 86 stations, in order to test the significance of fine-tuning for performance and for the model's learning process. Also, to interpret the learning of spatial patterns, a perturbation was introduced into the gridded input data and the sensitivity of the model output to the perturbation was shown in spatial heat maps. Finally, to evaluate the performance of the model, different benchmark models were trained using, as far as possible, a training setup similar to that of the CNN-LSTM model. These are a CNN without the LSTM part (regional model), an LSTM without the CNN part (regional model), a simple single-layer ANN (regional model), and an LSTM trained for individual stations (considered the state of the art). All of these benchmark models were fine-tuned for the 3 clusters as well.
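
An illustrative PyTorch skeleton of such a CNN encoder feeding an LSTM (the grid size, channel counts and layer sizes are invented; this is not the authors' exact architecture):

    import torch
    import torch.nn as nn

    class CNNLSTM(nn.Module):
        def __init__(self):
            super().__init__()
            # Encode each daily grid (3 forcings on a 16x16 grid) into a feature vector.
            self.cnn = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),  # -> 16 * 4 * 4 = 256 features
            )
            self.lstm = nn.LSTM(input_size=256, hidden_size=64, batch_first=True)
            self.head = nn.Linear(64, 1)  # one day of streamflow

        def forward(self, x):  # x: (batch, days, 3, 16, 16)
            b, t = x.shape[:2]
            feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
            out, _ = self.lstm(feats)
            return self.head(out[:, -1])  # predict from the last timestep

    y = CNNLSTM()(torch.randn(2, 365, 3, 16, 16))
    print(y.shape)  # torch.Size([2, 1])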

The CNN-LSTM model, after fine-tuning, performed well in predicting daily streamflow over the test period, with a median Nash-Sutcliffe efficiency (NSE) of 0.62 and 65% of the 86 stations reaching NSE > 0.6, outperforming all benchmark models trained regionally with the same setup. The model also achieved performance comparable to the state-of-the-art LSTM trained for individual stations. Fine-tuning improved the performance of all models over the test period. The CNN-LSTM model was shown to be more sensitive to input perturbations near the stations for which the prediction is intended. This was even clearer for the fine-tuned model, indicating that the model learns spatially relevant information from the gridded input data and that fine-tuning helps guide the model to focus more on the relevant input.

This work shows the potential of CNN and LSTM for regional rainfall-runoff modeling by capturing the spatiotemporal patterns involved in the RR process. The work also contributes towards a more physically interpretable data-driven modeling paradigm.

How to cite: Mohammed, A. and Corzo, G.: Evaluation of regional Rainfall-Runoff modelling using convolutional long short-term memory:  CAMELS dataset in US as a case study., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4177, https://doi.org/10.5194/egusphere-egu23-4177, 2023.

EGU23-4179 | Orals | HS3.3

Improving Data-Driven Flow Forecasting in Large Basins using Machine Learning to Route Flows 

David Lambl, Mostafa Elkurdy, Phil Butcher, Laura K Read, and Alden Keefe Sampson

Producing accurate hourly streamflow forecasts in large basins is difficult without a distributed model to represent both streamflow routing through the river network and the spatial heterogeneity of land and weather conditions. HydroForecast is a theory-guided deep learning flow forecasting product that consists of short-term (hourly predictions out to 10 days), seasonal (10-day predictions out to a year) and daily reanalysis models. This work focuses primarily on the short-term model, which has award-winning accuracy across a wide range of basins.

In this work, we discuss the implementation of a novel distributed flow forecasting capability in HydroForecast, which splits basins into smaller sub-basins and routes flows from each sub-basin to the downstream forecast points of interest. The entire model is implemented as a deep neural network, allowing end-to-end training of both sub-basin runoff prediction and flow routing. The model's routing component predicts a unit hydrograph of flow travel time at each river reach and timestep, allowing us to inspect and interpret the learned river routing and to seamlessly incorporate any upstream gauge data.
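
The routing idea can be pictured with a short sketch (a simplification under assumed values, not the HydroForecast implementation): the learned unit hydrograph is a normalized travel-time distribution with which sub-basin runoff is convolved on its way to the outlet.

    import numpy as np

    uh = np.exp(-np.arange(24) / 6.0)
    uh /= uh.sum()                                 # assumed 24-hour unit hydrograph
    runoff = np.random.default_rng(1).random(100)  # hourly sub-basin runoff
    routed = np.convolve(runoff, uh)[:100]         # causal routing to the outlet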

We compare the accuracy of this distributed model to our original flow forecasting model at selected sites and discuss future improvements that will be made to this model.

How to cite: Lambl, D., Elkurdy, M., Butcher, P., Read, L. K., and Sampson, A. K.: Improving Data-Driven Flow Forecasting in Large Basins using Machine Learning to Route Flows, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4179, https://doi.org/10.5194/egusphere-egu23-4179, 2023.

EGU23-4801 | Posters on site | HS3.3

Improving Streamflow Predictions over Indian Catchments using Long Short Term Memory Networks 

Bhanu Magotra, Manabendra Saharia, and Chandrika Thulaseedharan Dhanya

Streamflow modelling plays a critical role in water resource management activities. Physically based models require substantial computational resources and large amounts of input meteorological data, which results in high operating costs and long run times. On the other hand, with advances in deep learning techniques, data-driven models such as long short-term memory (LSTM) networks have been shown to successfully model non-linear rainfall-runoff relationships from historically observed data at a fraction of the computational cost. Moreover, using physics-informed machine learning techniques, the physical consistency of data-driven models can be further improved. In this study, one such method is applied: we trained a physics-informed LSTM network over 278 Indian catchments to simulate streamflow at a daily timestep using historically observed precipitation and streamflow data. The ancillary data included meteorological forcings, static catchment attributes, and Noah-MP-simulated land surface states and fluxes such as soil moisture, latent heat, and total evapotranspiration. The LSTM model's performance was evaluated using error metrics such as the Nash-Sutcliffe efficiency (NSE), the Kling-Gupta efficiency (KGE) and its components, along with skill scores based on a 2x2 contingency matrix for hydrological extremes. The trained LSTM model shows improved performance in simulating streamflow over the catchments compared to the physically based model. This will be the first study over India to generate reliable streamflow simulations using a hybrid state-of-the-art approach, which will be beneficial to policy makers for effective water resource management in India.
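
For reference, the KGE mentioned above decomposes model skill into correlation, variability and bias terms; a standard implementation (not tied to the study's code) is:

    import numpy as np

    def kge(sim: np.ndarray, obs: np.ndarray) -> float:
        """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
        r = np.corrcoef(sim, obs)[0, 1]  # correlation component
        alpha = sim.std() / obs.std()    # variability ratio
        beta = sim.mean() / obs.mean()   # bias ratio
        return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)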

How to cite: Magotra, B., Saharia, M., and Dhanya, C. T.: Improving Streamflow Predictions over Indian Catchments using Long Short Term Memory Networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4801, https://doi.org/10.5194/egusphere-egu23-4801, 2023.

EGU23-4842 | ECS | Orals | HS3.3

Introducing DL-GLOBWB: a deep-learning surrogate of a process-based global hydrological model 

Bram Droppers, Myrthe Leijnse, Marc F.P. Bierkens, and Niko Wanders

Process-based global hydrological models are an important tool for sustainable development and policy making in today's water-scarce world. These models are able to inform national- to regional-scale water management with basin-scale accounting of water availability and demand, and to project the impacts of climate change and adaptation on water resources. However, the increasing need for better and higher-resolution hydrological information is proving difficult for these state-of-the-art process-based models, as the associated computational requirements are significant.

Recently, the deep-learning community has shown that neural networks (in particular the LSTM network) can provide hydrological information with an accuracy that rivals, if not exceeds, that of process-based hydrological models. Although the training of these neural networks takes time, prediction is fast compared to process-based simulation. Nevertheless, training is mostly done on historical observations, and thus projections under climate change and adaptation are uncertain.

Inspired by the complementary strengths and weaknesses of the process-based and deep-learning approaches, we present DL-GLOBWB: a deep-learning surrogate of the state-of-the-art PCR-GLOBWB global hydrological model. DL-GLOBWB predicts all water-balance components of the process-based model, including human water demand and abstraction, with an nRMSE of 0.05 (range between 0.0001 and 0.32). The DL-GLOBWB surrogate is orders of magnitude faster than its process-based counterpart, especially as surrogates trained at low resolution (e.g. 30 arc-minutes) can effectively be downscaled to higher resolutions (e.g. 5 arc-minutes).

In addition to introducing DL-GLOBWB, our presentation will explore future applications of this deep-learning surrogate, such as (1) improving model calibration and performance by comparing DL-GLOBWB outputs with in-situ data and satellite observations; (2) training DL-GLOBWB on future model projections to include global change; and (3) using DL-GLOBWB to dynamically, and at high resolution, visualize the impact of climate change and adaptation for stakeholders.

How to cite: Droppers, B., Leijnse, M., Bierkens, M. F. P., and Wanders, N.: Introducing DL-GLOBWB: a deep-learning surrogate of a process-based global hydrological model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4842, https://doi.org/10.5194/egusphere-egu23-4842, 2023.

IMERG is a global satellite-based precipitation dataset produced by NASA. It provides valuable rainfall information to support the design and operation of disaster and risk management worldwide. In operation, NASA offers three types of IMERG Level 3 (L3) products, with different trade-offs between latency and accuracy: Early run (4-hour latency), Late run (14-hour latency) and Final run (3.5-month latency). The Final-run product integrates multi-sensor retrievals and provides the highest-quality precipitation estimates of the three IMERG products. However, it suffers from a long processing latency, which hinders its applicability to near-real-time applications. In the past 10 years, deep learning techniques have made significant breakthroughs in various scientific fields, including short-term rainfall forecasting. Deep learning models have been shown to have the potential to learn the complex variations in weather systems and to outperform Numerical Weather Prediction (NWP) in terms of short-lead-time predictability and the computational resources required for operation.

In this research, we explore the potential of deep learning (DL) for generating a high-quality satellite-based precipitation product with low latency. More specifically, we investigate whether DL models can learn the difference between the Final- and Early-run products and thus predict a Final-run-like product using the Early-run product as input; a low-latency yet high-quality IMERG precipitation product can thereby be obtained. Various DL techniques are being tested in this work, including auto-encoders (AE), ConvLSTM and deep generative models. IMERG data between 2018 and 2020 over a rectangular area centred on the UK are used for model training and testing, and ground rain gauge records are used to evaluate the performance of the original and predicted products. This pilot area includes both ocean and land regions, which enables a comparison of model performance between two different surface conditions. Preliminary analysis suggests that patterns do exist in the differences between the Early- and Final-run products, and the capacity of the selected DL models to learn these differences will be further investigated. The proposed work has great potential to improve the applicability of IMERG products in an operational context.

How to cite: Hung, H. T. and Wang, L.-P.: IMERG Run Deep: Can we produce a low-latency IMERG Final run product with a deep learning based prediction model?, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4887, https://doi.org/10.5194/egusphere-egu23-4887, 2023.

EGU23-4970 | ECS | Posters on site | HS3.3

Use of Long-Short Term Memory network (LSTM) in the reconstruction of missing water level data in the Seine River. 

Imad Janbain, Julien Deloffre, Abderrahim Jardani, Minh Tan Vu, and Nicolas Massei

Missing data is a major problem that appears in many database fields, for a variety of reasons. It has always been necessary to fill the gaps, and this becomes unavoidable and more complicated when the missing periods are longer. Several machine-learning-based approaches have been introduced to deal with this problem.

The purpose of this paper is to discuss the effectiveness of a new methodology, added prior to the LSTM deep learning algorithm, for filling in the missing data in the hourly surface water level time series of stations installed along the Seine River in Normandy, France. In our study, due to a lack of data, a challenging situation was faced where only the water level data from the same station, which themselves contain many missing parts, were used as input and output variables to fill the station itself in a self-learning approach. This contrasts with common work on imputing missing data, where several features are available to take advantage of in a multivariate and spatiotemporal way, e.g. using the same variable from other stations or exploiting other physical variables and meteorological data. The reconstruction accuracy of the proposed method depends on both the size of the available/missing data and the parameters of the networks. We therefore performed sensitivity analyses on both the properties of the networks and the structuring of the input and output data to determine the appropriate strategy. During this analysis, a data preprocessing method was developed and added prior to the LSTM model. This method was arrived at by testing many scenarios, each an updated version of the last; along the way, limitations were identified and overcome. Finally, the last model version was able to impute missing periods of up to one year of hourly data with high accuracy (one-year RMSE = 0.14 m), regardless of the location of the missing part in the series or of its size.

How to cite: Janbain, I., Deloffre, J., Jardani, A., Vu, M. T., and Massei, N.: Use of Long-Short Term Memory network (LSTM) in the reconstruction of missing water level data in the Seine River., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4970, https://doi.org/10.5194/egusphere-egu23-4970, 2023.

The objective function plays an important role in the training process of deep learning models, since it largely determines the trained values of the model parameters and influences model performance. In this study, we establish two application-orientated objective functions, namely the high flow balance error (HFBE) and the transformed mean absolute percentage error (MAPE*), for forecasts of high flows and low flows, respectively, in an LSTM model. We examine the strengths and weaknesses of streamflow forecast models trained on HFBE, MAPE* and the mean square error (MSE) based on multiple performance metrics. Furthermore, we propose an objective function-based ensemble model (OEM) framework that integrates the models trained on different objective functions, so as to take advantage of trained models focusing on different aspects of streamflow and thus achieve better overall performance. Our results for 273 catchments over the USA show that the models trained on HFBE can alleviate the underestimation of high flows found in models trained on MSE, and perform remarkably better for high flows. It is also found that models trained on MAPE* outperform the other two models in low-flow forecasting, no matter what algorithm is used to establish the model. By incorporating the three models trained on HFBE, MAPE* and MSE, our proposed OEM performs well in forecasts of both high and low flows, and realistically captures the mean and variability of the observed streamflow under a variety of hydrometeorological conditions. This study highlights the necessity of applying application-orientated objective functions for given projects and the great potential of ensemble learning methods for multi-optimization in hydrological modeling.
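
The abstract does not give the formulas for HFBE and MAPE*; purely as a hedged sketch of how such application-orientated objectives plug into training, one might write custom losses along these lines (the exact functional forms below are assumptions):

    import torch

    def mape_star(sim, obs, eps=1.0):
        # Assumed form: MAPE on offset flows, stabilising the division near zero.
        return torch.mean(torch.abs((obs - sim) / (obs + eps)))

    def hfbe(sim, obs, q=0.9):
        # Assumed form: balance error computed only over flows above the q-quantile.
        mask = obs >= torch.quantile(obs, q)
        return torch.abs((sim[mask] - obs[mask]).sum() / obs[mask].sum())

    # Either loss can replace MSE in a standard training loop:
    # loss = hfbe(model(x), y); loss.backward()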

How to cite: Wang, D.: The role of ensemble learning in multi-optimization for streamflow prediction, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5044, https://doi.org/10.5194/egusphere-egu23-5044, 2023.

EGU23-5199 | ECS | Posters virtual | HS3.3

How do machine learning models deal with inter-catchment groundwater flows? 

Nicolas Weaver, Taha-Abderrahman El-Ouahabi, Thibault Hallouin, François Bourgin, Charles Perrin, and Vazken Andréassian

Machine learning models have recently gained popularity in hydrological modelling at the catchment scale, fuelled by the increasing availability of large-sample data sets and the increasing accessibility of deep learning frameworks, computing environments and open-source tools. In particular, several large-sample studies at daily and monthly time scales across the globe have shown successful applications of the LSTM architecture as a regional model that learns the hydrological behaviour at the catchment scale. Yet a deeper understanding of how machine learning models close the water balance, and of how they deal with inter-catchment groundwater flows, is needed to move towards better process understanding. We investigate the performance and behaviour of the LSTM architecture at a monthly time step on a large-sample French data set coined CHAMEAU, following the CAMELS initiative. To provide additional information to the learning step of the LSTM, we use the parameter sets and fluxes from the conceptual GR2M model, which has a dedicated formulation for dealing with inter-catchment groundwater flows. We see this study as a contribution towards the development of hybrid hydrological models.

How to cite: Weaver, N., El-Ouahabi, T.-A., Hallouin, T., Bourgin, F., Perrin, C., and Andréassian, V.: How do machine learning models deal with inter-catchment groundwater flows?, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5199, https://doi.org/10.5194/egusphere-egu23-5199, 2023.

EGU23-5445 | ECS | Posters on site | HS3.3

Physics-Informed Neural Networks for Statistical Emulation of Hydrodynamical Numerical Models 

James Donnelly, Alireza Daneshkhah, and Soroush Abolfathi

The application of numerical models for flood and inundation modelling has become widespread in the past decades as a result of significant improvements in computational capabilities. Computational approaches to flood forecasting have significant benefits compared to empirical approaches, which estimate statistical patterns of hydrological variables from observed data. However, there is still a significant computational cost associated with numerical flood modelling at high spatio-temporal resolutions. This limitation of numerical modelling has led to the development of statistical emulators: machine learning (ML) models designed to learn the underlying generating process of the numerical model. The data-driven approach to ML relies entirely upon a set of training data to inform decisions about model selection and parameterisation. Deep learning models have leveraged data-driven learning methods, improvements in hardware and an increasing abundance of data to obtain breakthroughs in various fields such as computer vision, natural language processing and autonomous driving. In many scientific and engineering problems, however, the cost of obtaining data is high, so there is a need for ML models that are able to generalise in the 'small-data' regime common to many complex problems. In this study, to overcome the extrapolation and over-fitting issues of data-driven emulators, a physics-informed neural network model is adopted for the emulation of two-dimensional hydrodynamic models that simulate fluid flow according to the shallow water equations. This study introduces a novel approach to encoding the conservation of mass into a deep learning model, with additional terms included in the optimisation criterion acting to regularise the model, avoid over-fitting and produce more physically consistent predictions by the emulator.
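
A hedged sketch of the idea (the discrete penalty below is an illustration under simplifying assumptions of a uniform grid, not the paper's exact criterion): the training loss augments the data misfit with the residual of the mass-conservation (continuity) equation.

    import torch

    def pinn_loss(h_pred, h_obs, u, v, dx, dt, lam=0.1):
        """Data misfit plus a penalty on the discrete continuity residual
        dh/dt + d(hu)/dx + d(hv)/dy = 0; tensors have shape (time, ny, nx)."""
        data = torch.mean((h_pred - h_obs) ** 2)
        dhdt = (h_pred[1:] - h_pred[:-1]) / dt
        flux_x = (h_pred[:-1] * u[:-1]).diff(dim=2) / dx  # d(hu)/dx
        flux_y = (h_pred[:-1] * v[:-1]).diff(dim=1) / dx  # d(hv)/dy
        residual = dhdt[:, 1:, 1:] + flux_x[:, 1:, :] + flux_y[:, :, 1:]
        return data + lam * torch.mean(residual ** 2)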

How to cite: Donnelly, J., Daneshkhah, A., and Abolfathi, S.: Physics-Informed Neural Networks for Statistical Emulation of Hydrodynamical Numerical Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5445, https://doi.org/10.5194/egusphere-egu23-5445, 2023.

EGU23-5736 | ECS | Orals | HS3.3

A Novel Workflow for Streamflow Prediction in the Presence of Missing Gauge Observations 

Rendani Mbuvha, Peniel Julien Yise Adounkpe, Mandela Coovi Mahuwetin Houngnibo, and Nathaniel Newlands

Streamflow predictions are a vital tool for detecting flood and drought events. Such predictions are even more critical for Sub-Saharan African regions that are vulnerable to the increasing frequency and intensity of such events. These regions are sparsely gauged, with few available gauging stations that are often plagued with missing data due to various causes, such as harsh environmental conditions and constrained operational resources.

This work presents a novel workflow for predicting streamflow in the presence of missing gauge observations. We leverage bias correction of the GEOGloWS ECMWF Streamflow Service (GESS) forecasts for missing-data imputation and predict future streamflow using state-of-the-art Temporal Fusion Transformers at ten river gauging stations in the Benin Republic.

We show, by simulating missingness in a testing period, that GESS forecasts have a significant bias that results in poor imputation performance over the ten Beninese stations. Our findings suggest that overall bias correction by Elastic Net and Gaussian Process regression achieves superior performance relative to traditional imputation by established methods such as Random Forest, k-Nearest Neighbour, and GESS lookup. We also show that the Temporal Fusion Transformer yields high predictive skill and further provides explanations for predictions through the weights of its attention mechanism. The findings of this work provide a basis for integrating global streamflow prediction model data and state-of-the-art machine learning models into operational early-warning decision-making systems (e.g., flood/drought alerts) in resource-constrained countries vulnerable to drought and flooding due to extreme weather events.
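
A minimal sketch of the bias-correction-for-imputation step (scikit-learn; the lagged covariates and hyperparameters are illustrative assumptions, and wrap-around at the series edges from np.roll is ignored here):

    import numpy as np
    from sklearn.linear_model import ElasticNet

    def bias_correct_impute(gess, obs):
        """Fit an ElasticNet on days where gauge data exist, then fill gaps
        with bias-corrected GESS forecasts. gess, obs: 1-D daily arrays,
        obs containing NaN where observations are missing."""
        X = np.column_stack([gess, np.roll(gess, 1), np.roll(gess, 2)])  # lags
        valid = ~np.isnan(obs)
        model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X[valid], obs[valid])
        filled = obs.copy()
        filled[~valid] = model.predict(X[~valid])
        return filled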

How to cite: Mbuvha, R., Adounkpe, P. J. Y., Houngnibo, M. C. M., and Newlands, N.: A Novel Workflow for Streamflow Prediction in the Presence of Missing Gauge Observations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5736, https://doi.org/10.5194/egusphere-egu23-5736, 2023.

EGU23-6313 | ECS | Posters on site | HS3.3

Moving away from deterministic solutions: A probabilistic machine learning approach to account for geological model uncertainty in groundwater modelling 

Mathias Busk Dahl, Troels Norvin Vilhelmsen, Rasmus Bødker Madsen, and Thomas Mejer Hansen

Decision-making related to groundwater management often relies on results from a deterministic groundwater model representing one ‘optimal’ solution. However, such a single deterministic model lacks representation of subsurface uncertainties. The simplicity of such a model is appealing, as typically only one is needed, but comes with the risk of overlooking critical scenarios and possible adverse environmental effects. Instead, we argue that groundwater management should be based on a probabilistic model that incorporates the uncertainty of the subsurface structures to the extent that it is known. If such a probabilistic model exists, it is, in principle, simple to propagate the uncertainties of the model parameters using multiple numerical simulations, allowing a quantitative and probabilistic basis for decision-makers. However, in practice, such an approach can become computationally intractable. Thus, there is a need for quantifying and propagating uncertainty through numerical simulations and presenting outcomes without losing the speed of the deterministic approach.

This presentation provides a probabilistic approach to the specific groundwater modelling task of determining well recharge areas, accounting for the geological uncertainty associated with the model by means of a deep neural network. The results of such a task are often part of an investigation for new abstraction well locations and should, therefore, present all possible outcomes to give informative decision support. We advocate the use of a probabilistic approach over a deterministic one by comparing results and presenting examples where probabilistic solutions are essential for proper decision support. To overcome the significant increase in computation time, we argue that this problem can be solved using a probabilistic neural network trained on examples of model outputs. We present a way of training such a network and show how it performs in terms of speed and accuracy. Ultimately, this presentation aims to contribute a method for incorporating model uncertainty in groundwater modelling without compromising the speed of the deterministic models.

How to cite: Busk Dahl, M., Norvin Vilhelmsen, T., Bødker Madsen, R., and Mejer Hansen, T.: Moving away from deterministic solutions: A probabilistic machine learning approach to account for geological model uncertainty in groundwater modelling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6313, https://doi.org/10.5194/egusphere-egu23-6313, 2023.

EGU23-6466 | ECS | Orals | HS3.3

Neural ODE Models in Large-Sample Hydrology 

Marvin Höge, Andreas Scheidegger, Marco Baity-Jesi, Carlo Albert, and Fabrizio Fenicia

Neural Ordinary Differential Equation (ODE) models have demonstrated high potential in providing accurate hydrologic predictions and process understanding for single catchments (Höge et al., 2022). Neural ODEs fuse a neural network model core with a mechanistic equation framework. This hybrid structure offers both traceability of model states and processes, as in conceptual hydrologic models, and the high flexibility of machine learning to learn and refine model interrelations. Aside from the functional dependence of internal processes on driving forces, such as that of evapotranspiration on temperature, Neural ODEs are also able to learn the effect of catchment-specific attributes, e.g. land cover types, on processes when trained over multiple basins simultaneously.
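
For the general construction, a minimal single-bucket sketch, assuming the torchdiffeq package: the water-balance shell dS/dt = P - ET - Q stays mechanistic while the evapotranspiration and discharge fluxes are small neural networks (the actual model of the study is richer than this):

    import torch
    import torch.nn as nn
    from torchdiffeq import odeint   # assumes the torchdiffeq package

    class BucketODE(nn.Module):
        """dS/dt = P(t) - ET(S, T) - Q(S): mechanistic balance, learned fluxes."""
        def __init__(self, forcings):
            super().__init__()
            self.forcings = forcings   # (n_days, 2): precipitation, temperature
            self.et_net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(),
                                        nn.Linear(16, 1), nn.Softplus())
            self.q_net = nn.Sequential(nn.Linear(1, 16), nn.Tanh(),
                                       nn.Linear(16, 1), nn.Softplus())

        def forward(self, t, S):
            idx = torch.clamp(t.long(), 0, len(self.forcings) - 1)
            p, temp = self.forcings[idx]   # piecewise-constant daily forcing
            et = self.et_net(torch.stack([S.squeeze(), temp]).unsqueeze(0)).squeeze()
            q = self.q_net(S.view(1, 1)).squeeze()
            return (p - et - q).view(1)

    forcings = torch.rand(365, 2)
    func = BucketODE(forcings)
    S = odeint(func, torch.tensor([10.0]), torch.arange(0.0, 365.0))  # storage path
    runoff = func.q_net(S)   # discharge; differentiable w.r.t. the network weights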

 

We demonstrate the performance of a generic Neural ODE architecture in a hydrologic large-sample setup with respect to both predictive accuracy and process interpretability. Using several hundred catchments, we show the capability of Neural ODEs to learn the general interplay of catchment-specific attributes and hydrologic drivers in order to predict discharge in out-of-sample basins. Further, we show how functional relations learned (encoded) by the neural network can be translated (decoded) into an interpretable form, and how this can be used to foster understanding of processes and the hydrologic system.

 

Höge, M., Scheidegger, A., Baity-Jesi, M., Albert, C., and Fenicia, F.: Improving hydrologic models for predictions and process understanding using Neural ODEs, Hydrol. Earth Syst. Sci., 26, 5085–5102, https://hess.copernicus.org/articles/26/5085/2022/, 2022.

How to cite: Höge, M., Scheidegger, A., Baity-Jesi, M., Albert, C., and Fenicia, F.: Neural ODE Models in Large-Sample Hydrology, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6466, https://doi.org/10.5194/egusphere-egu23-6466, 2023.

EGU23-7347 | ECS | Orals | HS3.3

Deep learning for mapping water bodies in the Sahel 

Mathilde de FLEURY, Laurent Kergoat, Martin Brandt, Rasmus Fensholt, Ankit Kariryaa, Gyula Mate Kovács, Stéphanie Horion, and Manuela Grippa

Inland surface waters, especially lakes and small water bodies, are essential resources and have impacts on biodiversity, greenhouse gas emissions and health. This is particularly true in the semi-arid Sahelian region, where these resources remain largely unassessed and little is known about their number, size and quality. Remote sensing monitoring methods remain a promising tool to address these issues at the large scale, especially in areas where field data are scarce. Thanks to technological advances, current remote sensing systems provide data for regular monitoring over time and offer a high spatial resolution, up to 10 metres.

Several water detection methods have been developed, many of them using spectral information to differentiate water surfaces from soil, through thresholding on water indices (MNDWI, for example) or classification by clustering. These methods are sensitive to optical reflectance variability and are not straightforwardly applicable to regions, such as the Sahel, where the lakes and their environment are very diverse. In particular, the presence of aquatic vegetation is an important challenge and source of error for many of the existing algorithms and available databases.

Deep learning, a subset of machine learning methods for training deep neural networks, has emerged as the state-of-the-art approach for a large number of remote sensing tasks. In this study, we apply a deep learning model based on the U-Net architecture to detect water bodies in the Sahel using Sentinel-2 MSI data, and 86 manually defined lake polygons as training data. This framework was originally developed for tree mapping (Brandt et al., 2020, https://doi.org/10.1038/s41586-020-2824-5).   
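
A minimal sketch of the U-Net idea used here (pure PyTorch, one encoder level; the band count and channel widths are illustrative, and the study's actual framework is the Brandt et al. implementation):

    import torch
    import torch.nn as nn

    def block(c_in, c_out):
        return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                             nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

    class TinyUNet(nn.Module):
        """One-level U-Net: encoder, bottleneck, skip connection, decoder."""
        def __init__(self, in_bands=4):
            super().__init__()
            self.enc = block(in_bands, 32)
            self.down = nn.MaxPool2d(2)
            self.bottleneck = block(32, 64)
            self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
            self.dec = block(64, 32)          # 64 = 32 (skip) + 32 (upsampled)
            self.head = nn.Conv2d(32, 1, 1)   # per-pixel water logit

        def forward(self, x):
            e = self.enc(x)
            b = self.bottleneck(self.down(e))
            d = self.dec(torch.cat([self.up(b), e], dim=1))
            return self.head(d)   # train with BCEWithLogitsLoss against lake masks

    logits = TinyUNet()(torch.randn(1, 4, 128, 128))   # -> (1, 1, 128, 128)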

Our preliminary analyses indicate that our models achieve good accuracy (98%). Problems with aquatic vegetation no longer appear, and each lake is thus well delimited irrespective of water type and characteristics. Using the water delineations obtained, we then classify different optical water types and thereby highlight different types of waterbodies, which appear to be mostly turbid and eutrophic waters, allowing a better understanding of the eco-hydrological processes in this region.

This method demonstrates the effectiveness of deep learning in detecting water surfaces in the study region. Deriving water masks that account for all kinds of waterbodies offers a great opportunity to further characterize different water types. The method is easily reproducible thanks to the availability of the satellite data and algorithm, and can be further applied to detect dams and other human-made features in relation to lake environments.

How to cite: de FLEURY, M., Kergoat, L., Brandt, M., Fensholt, R., Kariryaa, A., Kovács, G. M., Horion, S., and Grippa, M.: Deep learning for mapping water bodies in the Sahel, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7347, https://doi.org/10.5194/egusphere-egu23-7347, 2023.

EGU23-7828 | ECS | Posters on site | HS3.3

Sub-seasonal daily precipitation forecasting based on Long Short-Term Memory (LSTM) models 

Claudia Bertini, Gerald Corzo, Schalk Jan van Andel, and Dimitri Solomatine

Water managers need accurate rainfall forecasts for a wide spectrum of applications, ranging from water resources evaluation and allocation to flood and drought predictions. In the past years, several frameworks based on Artificial Intelligence have been developed to improve the traditional Numerical Weather Prediction (NWP) forecasts, thanks to their ability to learn from past data, unravel hidden relationships among variables and handle large amounts of inputs. Among these approaches, Long Short-Term Memory (LSTM) models have emerged for their ability to predict sequence data, and have been successfully used for rainfall and flow forecasting, mainly with short lead times. In this study, we explore three different multivariate LSTM-based models, i.e. vanilla LSTM, stacked LSTM and bidirectional LSTM, to forecast daily precipitation for the upcoming 30 days in the area of the Rhine Delta, the Netherlands. We use both local atmospheric and global climate variables from the ERA5 reanalysis dataset to predict rainfall, and we introduce a fuzzy index for the models to account for seasonality effects. The framework is developed within the H2020 project CLImate INTelligence (CLINT), and its outcomes have the potential to improve forecasting of precipitation deficit in the study area.
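
A minimal sketch of the three variants compared (PyTorch; feature counts, hidden sizes and window length are illustrative assumptions):

    import torch
    import torch.nn as nn

    class PrecipLSTM(nn.Module):
        """Forecast the next `horizon` daily precipitation values from a
        multivariate predictor sequence (e.g. ERA5-derived features)."""
        def __init__(self, variant="vanilla", n_features=10, hidden=64, horizon=30):
            super().__init__()
            layers = 2 if variant == "stacked" else 1
            bidir = variant == "bidirectional"
            self.lstm = nn.LSTM(n_features, hidden, num_layers=layers,
                                bidirectional=bidir, batch_first=True)
            self.head = nn.Linear(hidden * (2 if bidir else 1), horizon)

        def forward(self, x):                  # x: (batch, time, n_features)
            out, _ = self.lstm(x)
            return self.head(out[:, -1, :])    # forecast from the last time step

    y = PrecipLSTM("bidirectional")(torch.randn(4, 90, 10))   # -> (4, 30)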

How to cite: Bertini, C., Corzo, G., van Andel, S. J., and Solomatine, D.: Sub-seasonal daily precipitation forecasting based on Long Short-Term Memory (LSTM) models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7828, https://doi.org/10.5194/egusphere-egu23-7828, 2023.

Terrestrial water storage (TWS) anomalies from the Gravity Recovery and Climate Experiment (GRACE) and its follow-on (GRACE-FO) satellite missions provide a unique opportunity to measure the impact of different climate extremes and human intervention on water use at regional and continental scales. However, temporal gaps within the GRACE and GRACE-FO missions (GRACE: 20 months, between GRACE and GRACE-FO: 11 months, GRACE-FO: 2 months) pose difficulties in analyzing spatiotemporal variations in TWS. In this study, a Convolutional Long Short-Term Memory (CNN-LSTM) model was developed to fill these gaps and reconstruct TWS for the Indian subcontinent (April 2002-July 2022). Various meteorological and climatic variables, such as precipitation, temperature, run-off, evapotranspiration, and vegetation, were integrated to predict GRACE TWS. The performance of the models was evaluated with Pearson’s correlation coefficient (PR), Nash-Sutcliffe efficiency (NSE), and normalised root mean square error (NRMSE). Results indicate that the CNN-LSTM model yielded a mean PR of 0.94 and 0.89, NSE of 0.87 and 0.8, and NRMSE of 0.075 and 0.101 in training and testing, respectively. Overall, the CNN-LSTM achieved good performance except in the northwestern region of India, where relatively poor performance might be due to high anthropogenic activity and arid climatic conditions. The reconstructed time series were further used to study the spatiotemporal variations of TWS over the Indian subcontinent.

Keywords: GRACE; Deep Learning; TWSA; Indian subcontinent
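
A minimal sketch of the convolutional-LSTM building block behind such a model (pure PyTorch; channel counts and the readout layer are illustrative assumptions, as the abstract does not specify the exact CNN-LSTM configuration):

    import torch
    import torch.nn as nn

    class ConvLSTMCell(nn.Module):
        """Convolutional LSTM cell: gate pre-activations come from a single
        convolution over the concatenated input and hidden state, so spatial
        structure in the driver fields is preserved."""
        def __init__(self, c_in, c_hidden, k=3):
            super().__init__()
            self.gates = nn.Conv2d(c_in + c_hidden, 4 * c_hidden, k, padding=k // 2)

        def forward(self, x, h, c):
            i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            return h, c

    # One pass over a year of monthly driver grids (precip, T, ET, ... as channels)
    cell = ConvLSTMCell(c_in=5, c_hidden=16)
    h = c = torch.zeros(1, 16, 64, 64)
    for t in range(12):
        h, c = cell(torch.randn(1, 5, 64, 64), h, c)
    tws = nn.Conv2d(16, 1, 1)(h)   # map the final state to a TWS anomaly grid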

How to cite: Moudgil, P. S. and Rao, G. S.: Filling Temporal Gaps within and between GRACE and GRACE-FO Terrestrial Water Storage Changes over Indian Sub-Continent using Deep Learning., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8218, https://doi.org/10.5194/egusphere-egu23-8218, 2023.

Recent years have seen an increase in deep learning applications for flow forecasting. Large-sample hydrological (LSH) studies typically try to predict the runoff of a catchment using some selection of hydrometeorological features from the respective catchment. One aspect of these models that has received little attention in LSH is the effect that data from upstream catchments have on model performance. The number of available stations and the distance between stations are highly variable between catchments, which creates a unique modelling challenge. Existing LSH studies either use some form of linear aggregation of upstream flows as input features or omit them altogether. The potential of upstream data to improve the performance of real-time flow forecasts has not yet been systematically evaluated on a large scale. The objective of our study is to evaluate methods for integrating upstream features into real-time, data-driven flow forecasting models. Our study uses a subset of Canadian catchments (n>150) from the HYSETS database. For each catchment, long short-term memory networks (LSTMs) are used to generate flow forecasts for lead times of 1 to 3 days. We evaluate methods for identifying, selecting, and integrating relevant upstream input features within a deep-learning modelling framework, which include using neighbouring upstream stations, using all upstream stations, and using all stations with embedded dimensionality reduction. Early results indicate that while the inclusion of upstream data often yields improvements in model performance, including too much upstream information can easily have detrimental effects.
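
One simple realisation of upstream-feature integration with dimensionality reduction (scikit-learn PCA; station and feature counts are random stand-ins, and the study's embedded reduction may instead be learned end-to-end inside the network):

    import numpy as np
    from sklearn.decomposition import PCA

    upstream = np.random.rand(3650, 12)   # daily flows at 12 upstream stations (stand-in)
    local = np.random.rand(3650, 5)       # hydrometeorological features of the target catchment

    pca = PCA(n_components=3).fit(upstream)             # compress the upstream network
    features = np.hstack([local, pca.transform(upstream)])
    # `features` then forms the input sequence of the LSTM forecast model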

How to cite: Snieder, E. and Khan, U.: A large sample study of the effects of upstream hydrometeorological input features for LSTM-based daily flow forecasting in Canadian catchments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8746, https://doi.org/10.5194/egusphere-egu23-8746, 2023.

EGU23-9726 | ECS | Posters on site | HS3.3

Flood Forecasting with Deep Learning LSTM Networks: Local vs. Regional Network Training Based on Hourly Data 

Tanja Morgenstern, Jens Grundmann, and Niels Schütze

Floods are among the most frequently occurring natural disasters in Germany. Therefore, predicting their occurrence is a crucial task for efficient disaster management and for the protection of life, property, infrastructure and cultural assets. In recent years, Deep Learning methods have gained popularity in the research field of flood forecasting – Long Short-Term Memory (LSTM) networks being among them.

Efficient disaster management needs a fine temporal resolution of runoff predictions. Past work at TU Dresden on LSTM networks shows certain challenges when using input data with hourly resolution, such as systematically poor timing in peak flow prediction (Pahner et al., 2019; Morgenstern et al., 2021). At times, disaster management even requires flood forecasts for hitherto unobserved catchments, so in total a regionally transferable rainfall-runoff model with a fine temporal resolution is needed. We derived the idea for a potential approach from Kratzert et al. (2019) and Fang et al. (2022): they demonstrate that LSTM networks for rainfall-runoff (R-R) modeling benefit from the integration of multiple diverse catchments in the training dataset instead of a strictly local dataset, as this allows the networks to learn universal hydrologic catchment behavior. However, their training datasets consist of daily-resolution data.

Following this approach, in this study we train the LSTM networks using single catchments ("local network training") as well as combinations of diverse catchments in Saxony, Germany ("regional network training"). The training data (hourly resolution) consist of area averages of observed precipitation as well as of observed discharge at long-term observation gauges in Saxony. The gauges belong to small, fast-responding Saxon catchments and vary in their hydrological and geographical properties, which in turn are part of the network training as well.

We present preliminary results and investigate the following questions:

  • With a finer temporal resolution than daily values, characteristics of flood waves become more pronounced. Concerning the detailed simulation of flood waves, do regional LSTM-based R-R-models enable more accurate and robust flow predictions compared to local LSTM-based R-R-models – especially for rare extreme events?
  • Are regional LSTM-based R-R-models – trained at this temporal resolution – able to generalize to unobserved areas or areas with discharge observations unsuitable for network training?

 

References

Fang, K., Kifer, D., Lawson, K., Feng, D., Shen, C. (2022). The Data Synergy Effects of Time-Series Deep Learning Models in Hydrology. Water Resources Research, 58. DOI: 10.1029/2021WR029583

Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., Nearing, G. (2019). Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets. Hydrology and Earth System Sciences, 23, 5089–5110. DOI: 10.5194/hess-23-5089-2019

Morgenstern, T., Pahner, S., Mietrach, R., Schütze, N. (2021). Flood forecasting in small catchments using deep learning LSTM networks. DOI: 10.5194/egusphere-egu21-15072

Pahner, S., Mietrach, R., Schütze, N. (2019). Flood forecasting in small catchments: a comparative application of long short-term memory networks and artificial neural networks. DOI: 10.13140/RG.2.2.36770.89286

How to cite: Morgenstern, T., Grundmann, J., and Schütze, N.: Flood Forecasting with Deep Learning LSTM Networks: Local vs. Regional Network Training Based on Hourly Data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9726, https://doi.org/10.5194/egusphere-egu23-9726, 2023.

EGU23-10317 | ECS | Posters on site | HS3.3

A convolutional LSTM model with high accuracy to predict extreme precipitation space-time fields 

Hyojeong Choi and Dongkyun Kim

Precipitation forecast models based on meteorological radar data and machine learning architectures accurately predict the spatio-temporal evolution of precipitation. However, these data-driven forecasting models tend to underestimate the magnitude of extreme precipitation events, because their training is based on observed precipitation data in which normal precipitation events are significantly more frequent than rare extreme events. This study proposes a ConvLSTM-based precipitation nowcasting model that can accurately predict the space-time field of extreme precipitation. First, precipitation events were classified into 5 subsets using the k-means clustering algorithm based on their statistical properties, such as mean, standard deviation, skewness, duration, and the calendar month in which the precipitation event occurred. Then, a ConvLSTM-based neural network was trained on the subset containing extreme precipitation events (events with large mean, variance, and duration, occurring in summer months). The model was trained and tested on the 4 km-10 minute resolution radar-gauge composite precipitation field of the central part of South Korea (200 km x 200 km) for the periods 2009-2015 and 2016-2020, respectively. The NSE of the model trained on the whole precipitation dataset was 0.55, while that of the model trained on the extreme precipitation subset was 0.78, a significant improvement.
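
A minimal sketch of the event-classification step (scikit-learn; the event summary statistics below are randomly generated stand-ins):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # one row per event: mean, std, skewness, duration (h), calendar month
    stats = np.random.rand(500, 5)                   # stand-in for real event statistics
    X = StandardScaler().fit_transform(stats)        # put features on equal footing
    labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

    # pick the cluster with the largest mean intensity as the "extreme" subset
    extreme = int(np.argmax([stats[labels == k, 0].mean() for k in range(5)]))
    train_events = np.where(labels == extreme)[0]    # events used to train the ConvLSTM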

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2021R1A2C2003471).

How to cite: Choi, H. and Kim, D.: A convolutional LSTM model with high accuracy to predict extreme precipitation space-time fields, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10317, https://doi.org/10.5194/egusphere-egu23-10317, 2023.

EGU23-12315 | Posters on site | HS3.3

Meta-modeling with data-driven methods in hydrology 

Tobias Krueger, Mark Somogyvari, Ute Fehrenbach, and Dieter Scherer

Process-based models are the standard tools today when trying to understand how physical systems work. There are situations, however, when system understanding is not a primary focus and it is worth substituting existing process-based models with computationally more efficient meta-models (or emulators), i.e. proxies designed for specific applications. In our research we have explored potential data-driven meta-modeling approaches for applications in hydrology, designed to solve specific research questions.

In order to find a suitable meta-modeling approach, we have experimented with a set of different data-driven methods. We have employed a multi-fidelity modeling approach, where we gradually increased the complexity of our models. In total five different approaches were investigated: linear model with ordinary least squares regression, linear model with two different Bayesian methods (Hamiltonian Monte Carlo and transdimensional Monte Carlo) and two machine learning approaches (dense artificial neural network and long short-term memory (LSTM) neural network).

For method development, the project case study of the Groß Glienicker Lake was used. This is a glacial lake near Berlin, with a strong negative trend in water levels in the last decades. Supported by the observation model from the Central European Refined analysis, we had a daily, high-resolution meteorological dataset (precipitation and actual evapotranspiration) and lake level observations for 16 years.

All models are designed similarly: they predict lake level changes one day ahead using precipitation and evapotranspiration data from the previous 70 days. This interval was selected after an extensive parameter test with the linear model. By predicting the change in stored water, we linearize the problem, and by using a longer time interval we allow the methods to automatically compensate for any lag or memory effects inside the catchment. The different methods are evaluated by comparing the fits between the observed and the reconstructed lake levels.
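
A minimal sketch of this shared design, shown for the simplest (ordinary least squares) member of the model hierarchy; the array lengths are illustrative stand-ins:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    def lagged_design(p, et, level, n_lags=70):
        """Stack the previous 70 days of P and ET to predict tomorrow's level change."""
        X, y = [], []
        for t in range(n_lags, len(level) - 1):
            X.append(np.concatenate([p[t - n_lags:t], et[t - n_lags:t]]))
            y.append(level[t + 1] - level[t])   # linearised target: storage change
        return np.array(X), np.array(y)

    p, et, level = np.random.rand(3, 5840)   # ~16 years of daily data (stand-in)
    X, y = lagged_design(p, et, level)
    baseline = LinearRegression().fit(X, y)  # the OLS member of the model hierarchy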

As expected, increasing the model and inversion complexity improves the quality of the reconstruction. The use of nonlinear models was especially advantageous: the artificial neural network outperformed every other method. However, in our example these improvements were relatively small, meaning that in practice the simplest linear method was advantageous due to its computational efficiency, robustness, and ease of use and interpretation.

In this presentation we discuss the challenges of data preparation and optimal model design (especially the memory of the hydrological system), while finding the hyperparameters of the specific methods themselves was relatively straightforward. Our results suggest that problem linearization should be a preferred first step in any meta-modeling application, as it helps the training of nonlinear models as well. We also discuss data requirements, because we found that the size of our dataset was too small for the most complex LSTM method, which yielded unstable results and learned spurious background trends.

How to cite: Krueger, T., Somogyvari, M., Fehrenbach, U., and Scherer, D.: Meta-modeling with data-driven methods in hydrology, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12315, https://doi.org/10.5194/egusphere-egu23-12315, 2023.

EGU23-12952 | ECS | Orals | HS3.3

On the generalization of hydraulic-inspired graph neural networks for spatio-temporal flood simulations 

Roberto Bentivoglio, Elvin Isufi, Sebastian Nicolaas Jonkman, and Riccardo Taormina

The high computational cost of detailed numerical models for flood simulation hinders their use in real-time and limits uncertainty quantification. Deep-learning surrogates have thus emerged as an alternative to speed up simulations. However, most surrogate models currently work only for a single topography, meaning that they need to be retrained for different case studies, ultimately defeating their purpose. In this work, we propose a graph neural network (GNN) inspired by the shallow water equations used in flood modeling, that can generalize the spatio-temporal prediction of floods over unseen topographies. The proposed model works similarly to finite volume methods by propagating the flooding in space and time, given initial and boundary conditions. Following the Courant-Friedrichs-Lewy condition, we link the time step between consecutive predictions to the number of GNN layers employed in the model. We analyze the model's performance on a dataset of numerical simulations of river dike breach floods, with varying topographies and breach locations. The results suggest that the GNN-based surrogate can produce high-fidelity spatio-temporal predictions, for unseen topographies, unseen breach locations, and larger domain areas with respect to the training ones, while reducing computational times.
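
A minimal sketch of the layer-per-time-step idea (pure PyTorch with a dense adjacency matrix; the state variables, message function and topology are illustrative assumptions, and the actual model follows the shallow water equations more closely):

    import torch
    import torch.nn as nn

    class FloodGNNLayer(nn.Module):
        """One message-passing step ~ one hydraulic time step."""
        def __init__(self, n_state=3, hidden=32):
            super().__init__()
            self.msg = nn.Sequential(nn.Linear(2 * n_state, hidden), nn.ReLU(),
                                     nn.Linear(hidden, n_state))

        def forward(self, h, adj):
            # h: (n_nodes, n_state) water depth/discharge; adj: (n_nodes, n_nodes)
            agg = adj @ h                                     # sum over neighbours
            return h + self.msg(torch.cat([h, agg], dim=-1))  # residual update

    layers = nn.ModuleList(FloodGNNLayer() for _ in range(10))   # 10 time steps
    h = torch.zeros(500, 3); h[42, 0] = 2.0                      # breach inflow at node 42
    adj = (torch.rand(500, 500) < 0.01).float()                  # illustrative topology
    for layer in layers:
        h = layer(h, adj)                                        # propagate the flood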

How to cite: Bentivoglio, R., Isufi, E., Jonkman, S. N., and Taormina, R.: On the generalization of hydraulic-inspired graph neural networks for spatio-temporal flood simulations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12952, https://doi.org/10.5194/egusphere-egu23-12952, 2023.

EGU23-13493 | ECS | Posters on site | HS3.3

Comparison of a conceptual rainfall-runoff model with an artificial neural network model for streamflow prediction

Fadil Boodoo, Carole Delenne, Renaud Hostache, and Julien Freychet

Accurate streamflow forecasting can help minimize the negative impacts of hydrological events such as floods and droughts. To address this challenge, we explore artificial neural network models (ANNs) for streamflow forecasting. These models, which have proven successful in other fields, may offer improved accuracy and efficiency compared to traditional conceptually based forecasting approaches.

The goal of this study is to compare the performance of a traditional conceptual rainfall-runoff (hydrological) model with an artificial neural network (ANN) model for streamflow forecasting. As a test case, we use the Severn catchment in the United Kingdom. The adopted ANN model has a long short-term memory (LSTM) architecture with two hidden layers, each with 256 neurons. The model is trained on a 25-year dataset from 1988 to 2013 and tested on a 3-year dataset (from 2014 to 2016). It is further validated on data from 2017 to 2020 (2019 being a particularly wet year) to assess its performance in extreme hydrological conditions. The study focuses on daily and hourly predictions.

To conduct this study, the conceptual hydrological model Superflex is used as a benchmark. Both models are first evaluated using the Nash-Sutcliffe Efficiency (NSE) score. To enable a fair and accurate comparison, both models share the same inputs (i.e. meteorological forcings: total precipitation, daily maximum and minimum temperatures, daylight duration, mean surface downward short-wave radiation flux, and vapor pressure). The ANN model was implemented using the NeuralHydrology library developed by F. Kratzert.
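
For reference, the NSE score used to compare both models has a direct implementation; a minimal sketch:

    import numpy as np

    def nse(sim, obs):
        """Nash-Sutcliffe Efficiency: 1 is perfect; 0 is no better than the mean of obs."""
        sim, obs = np.asarray(sim), np.asarray(obs)
        return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)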

In our study, we found that the LSTM model is able to provide more accurate one-day forecasts than the hydrological model Superflex. For the daily predictions, the average NSE score of the LSTM model is 0.85 (with an average NSE score of 0.99 for the training period and 0.85 for the validation period), which is higher than the NSE score of 0.74 achieved by the Superflex model (with a score of 0.84 for the training period).

For the hourly predictions, the Superflex model achieved an NSE score of 0.88, with a score of 0.7 during training. The LSTM model had an average NSE score of 0.87, with an average score of 0.99 during training and 0.85 during validation.

These results were obtained without adjusting the hyperparameters and by training the model only on data from the Severn watershed. The ANN model has demonstrated promising results compared to a state-of-the-art conceptual hydrological model in our study. We will further compare both models using different training dataset periods and different catchments. These additional tests will provide more information on the capabilities of the LSTM model and help to confirm its effectiveness.

How to cite: Boodoo, F., Delenne, C., Hostache, R., and Freychet, J.: Comparison of a conceptual rainfall-runoff model with an artificial neural network model for streamflow prediction, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13493, https://doi.org/10.5194/egusphere-egu23-13493, 2023.

EGU23-14399 | ECS | Orals | HS3.3

LSTMs for Hydrological Modelling in Swiss Catchments 

Christina Lott, Leonardo Martins, Jonas Weiss, Thomas Brunschwiler, and Peter Molnar

Simulation of the catchment rainfall-runoff transformation with physically based watershed models is a traditional way to predict streamflow and other hydrological variables at catchment scales. However, the calibration of such models requires large data inputs and computational power and contains many parameters which are often impossible to constrain or validate. An alternative approach is to use data-driven machine learning for streamflow prediction.

In the past few years, LSTM (long short-term memory) models and their variants have been explored in rainfall-runoff modelling. Typical applications use daily climate variables as inputs and model the rainfall-runoff transformation processes with different timescales of memory. This is especially useful as delays in runoff production by snow accumulation and melt, soil water storage, evapotranspiration, etc., can be included. In contrast to feed-forward ANNs (artificial neural networks), LSTMs are capable of maintaining the sequential temporal order of inputs and, compared to RNNs (recurrent neural networks), of learning long-term dependencies [1].

However, current work on LSTMs mostly focuses on the USA, the UK and Brazil, where CAMELS datasets are available [1, 2, 3]. Catchments at higher altitudes with snow-driven dynamics and sometimes glaciers are present in small numbers in these datasets (if at all). Systematic applications of LSTMs for streamflow prediction in climates where a significant part of the catchments are snow- and ice-dominated are missing. In this work, an FS-LSTM (fast-slow LSTM) previously applied in Brazil is adapted for Swiss catchments to fill this gap [3]. The FS-LSTM explored builds on the work of Hoedt et al. (2021), which imposed mass constraints on an LSTM, called MC-LSTM [4]. The FS-LSTM adds fast and slow parts for streamflow, containing rainfall and soil moisture respectively. We will discuss benchmark results against an existing semi-distributed conceptual model widely used in Switzerland for streamflow simulation [5].
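
A minimal sketch of the mass-conserving recurrence from Hoedt et al. (2021) that the FS-LSTM builds on (dimensions illustrative): the input gate distributes new mass over cells, a column-stochastic matrix redistributes stored mass, and the output gate releases a fraction as streamflow, so total mass is conserved by construction.

    import torch
    import torch.nn as nn

    class MCCell(nn.Module):
        """Mass-conserving cell: every unit of input mass x ends up either in
        the cell states (storage) or in the output h (streamflow)."""
        def __init__(self, n_aux, n_cells):
            super().__init__()
            self.i_gate = nn.Linear(n_aux, n_cells)            # where new mass goes
            self.r_gate = nn.Linear(n_aux, n_cells * n_cells)  # how stored mass moves
            self.o_gate = nn.Linear(n_aux, n_cells)            # fraction released
            self.n = n_cells

        def forward(self, x, a, c):
            # x: (batch, 1) mass input (rain); a: (batch, n_aux) auxiliary inputs
            i = torch.softmax(self.i_gate(a), dim=-1)          # sums to 1: conserves x
            R = torch.softmax(self.r_gate(a).view(-1, self.n, self.n), dim=1)
            o = torch.sigmoid(self.o_gate(a))
            m = torch.einsum('bij,bj->bi', R, c) + i * x       # columns of R sum to 1
            return o * m, (1 - o) * m                          # outflow, new storage

    cell = MCCell(n_aux=3, n_cells=8)
    h, c = cell(torch.rand(4, 1), torch.randn(4, 3), torch.zeros(4, 8))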

 

References:

[1]: Kratzert et al., Rainfall-runoff modelling using Long Short-Term Memory (LSTM) networks, 2018.

[2]: Lees et al., Hydrological concept formation inside long short-term memory (LSTM) networks, 2022.

[3]: Quinones et al., Fast-Slow Streamflow Model Using Mass-Conserving LSTM, 2021.

[4]: Hoedt et al., MC-LSTM: Mass-Conserving LSTM, 2021.

[5]: Viviroli et al., An introduction to the hydrological modelling system PREVAH and its pre- and post-processing-tools, 2009.

How to cite: Lott, C., Martins, L., Weiss, J., Brunschwiler, T., and Molnar, P.: LSTMs for Hydrological Modelling in Swiss Catchments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14399, https://doi.org/10.5194/egusphere-egu23-14399, 2023.

Improving the understanding of processes is vital to hydrological modeling. One key challenge is how to extract interpretable information that can describe the complex hydrological system from the growing number of observations to advance our understanding of processes and modeling. To address this problem, we propose a data-driven framework to discover coordinate transformations, which transfer the original observations to a reduced-dimension system. The framework combines deep learning methods with sparse regression to approximate a specific hydrological process: deep learning methods have rich representations that promote generalization, and sparse regression can identify parsimonious models that promote interpretability. By doing so, we can identify the essential latent variables in a physically meaningful coordinate system where the hydrological processes are linearly and sparsely represented, capturing the behavior of the system from observations. To demonstrate the framework, we focus on the evaporation process. The relationships between potential evaporation and climate variables including long/short wave radiation, air temperature, air pressure, relative humidity, and wind speed are quantified. The connections between the climate variables and the extracted coordinate components are evaluated to capture the pattern of the climate variables in the component space. The robustness and statistical stability of the framework are examined based on distributed observations from FluxNet towers over North America. The resulting modeling framework shows the potential of deep learning methods for improving our knowledge of the hydrological system.
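
A minimal sketch of the two ingredients (a PyTorch autoencoder plus a scikit-learn Lasso; sizes and the random stand-in data are assumptions, and in the actual framework the two parts are trained jointly rather than sequentially):

    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.linear_model import Lasso

    enc = nn.Sequential(nn.Linear(6, 16), nn.Tanh(), nn.Linear(16, 2))  # 6 climate vars -> 2 coords
    dec = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 6))
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

    X = torch.randn(5000, 6)   # radiation, temperature, pressure, humidity, wind, ...
    for _ in range(200):       # learn the coordinate transformation
        opt.zero_grad()
        loss = nn.functional.mse_loss(dec(enc(X)), X)
        loss.backward()
        opt.step()

    Z = enc(X).detach().numpy()              # latent coordinates
    pet = np.random.rand(5000)               # stand-in for potential evaporation
    sparse = Lasso(alpha=0.01).fit(Z, pet)   # parsimonious linear form in the new coordinates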

How to cite: Hu, X., Tuo, Y., and Disse, M.: Deep learning based coordinates transformations for improving process understanding in hydrological modeling system, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14631, https://doi.org/10.5194/egusphere-egu23-14631, 2023.

EGU23-15575 | Orals | HS3.3

Application of deep convolutional neural networks for precipitation estimation through both top-down and bottom-up approaches 

Hamidreza Mosaffa, Paolo Filippucci, Luca Ciabatta, Christian Massari, and Luca Brocca

Reliable and accurate precipitation estimates are crucial for various hydrological applications, including water resources management, drought monitoring and natural hazard prediction. The two main approaches for estimating precipitation from satellite data are top-down and bottom-up. The top-down approach uses data from geostationary and low Earth orbiting satellites to infer precipitation from atmosphere and cloud information, while the bottom-up approach estimates precipitation from soil moisture observations, e.g. with the SM2RAIN algorithm. The main difference between the approaches is that the top-down approach measures precipitation more directly but instantaneously, which may lead to underestimation, while the bottom-up approach measures the rainfall accumulated between two consecutive soil moisture (SM) measurements, giving more reliable estimates. In this study, we develop a deep convolutional neural network (CNN) algorithm that combines the top-down and bottom-up approaches, estimating precipitation from satellite level-1 products, including satellite backscatter from the Advanced SCATterometer (ASCAT) and infrared (IR) and water vapor (WV) channels from geostationary satellites. The algorithm is assessed at 0.1° spatial and daily temporal resolution over Italy for the period 2019-2021. The results show that the developed model improves the accuracy of precipitation estimation, and they indicate significant potential for global precipitation estimation with this model.

How to cite: Mosaffa, H., Filippucci, P., Ciabatta, L., Massari, C., and Brocca, L.: Application of deep convolutional neural networks for precipitation estimation through both top-down and bottom-up approaches, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15575, https://doi.org/10.5194/egusphere-egu23-15575, 2023.

EGU23-15604 | ECS | Posters on site | HS3.3

Forecasting discharges through explainable machine learning approaches at an alpine karst spring 

Anna Pölz, Julia Derx, Andreas Farnleitner, and Alfred Paul Blaschke

Karst springs provide drinking water for approximately 700 million people worldwide. Complex subsurface flow processes lead to challenges for modelling spring discharges. Machine learning (ML) models possess the ability to learn non-linear patterns and show promising results in forecasting dynamic spring discharge. We compare the performance of three ML models of varying complexity in forecasting karst spring discharges: the multivariate adaptive regression spline model (MARS), a feed-forward neural network (ANN) and a long short-term memory model (LSTM). The well-studied alpine karst spring LKAS2 in Austria is used as a test case. We provide model explanations including feature attribution through Shapley additive explanations (SHAP), a method based on Shapley values. Our results show that the higher the model complexity, the higher the accuracy, based on the evaluated symmetric mean absolute percentage error of the three investigated models. With SHAP, every prediction can be explained through each feature in each input time step. We found seasonal model differences. For example, snow influenced the model mostly in winter and spring. Analyzing the combinations of input time steps and features provided further insights into the model performance. For instance, the SHAP results showed that a high electrical conductivity in recent time steps, which indicates that the karst water is less diluted with precipitation, leads to a reduced discharge forecast. These feature attribution results coincide with physical processes within karst systems. Therefore, the introduced SHAP method can increase the confidence in ML model forecasts and emphasizes the raison d’être of complex and accurate deep learning models in hydrology. This allows the operator to better understand and evaluate the model’s output, which is essential for drinking water management.
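
A minimal sketch of the attribution step, assuming the shap package's GradientExplainer can be applied to the trained sequence model (the stand-in model and tensors below are illustrative, not the study's configuration):

    import shap
    import torch
    import torch.nn as nn

    class TinyForecaster(nn.Module):
        """Stand-in for a trained LSTM discharge-forecast model."""
        def __init__(self, n_feat=4, hidden=16):
            super().__init__()
            self.lstm = nn.LSTM(n_feat, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)
        def forward(self, x):                  # x: (samples, time steps, features)
            out, _ = self.lstm(x)
            return self.head(out[:, -1, :])

    model = TinyForecaster()
    X = torch.randn(200, 30, 4)                          # e.g. precip, conductivity, ...
    explainer = shap.GradientExplainer(model, X[:100])   # background sample
    sv = explainer.shap_values(X[100:110])
    # sv mirrors the input shape: one attribution per feature and per input
    # time step, which is what enables the seasonal analyses described above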

How to cite: Pölz, A., Derx, J., Farnleitner, A., and Blaschke, A. P.: Forecasting discharges through explainable machine learning approaches at an alpine karst spring, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15604, https://doi.org/10.5194/egusphere-egu23-15604, 2023.

EGU23-15629 | ECS | Posters virtual | HS3.3

Peak Hydrological Event Simulation with Deep Learning Algorithm 

Nicole Tatjana Scherer, Muhammad Nabeel Usmann, Markus Disse, and Jingshui Huang

Most floods are caused by heavy rainfall events, including the disaster in the Simbach catchment in 2016. For the Simbach catchment, a study was already carried out using the conceptual Hydrologiska Byråns Vattenbalansavdelning (HBV) model to simulate the extreme event of 2016. While the calibration performance of the model is classified as very good, the overall validation is classified as unsatisfactory. Recent studies showed that data-driven models outperform benchmark rainfall-runoff models. A widely used data-driven model is the Long Short-Term Memory (LSTM) algorithm. The main advantage of this algorithm is its ability to learn short-term as well as long-term dependencies.

The objective of this work is to determine whether a data-driven model outperforms the conceptual model. For this purpose, in a first step an LSTM model is set up and its results are compared with those of the HBV model. It is assumed that the LSTM model outperforms the HBV model in training and validation but is not able to simulate the extreme event, because the extrapolation capabilities of neural networks are poor when they operate outside of their training range. In a second step, it is studied whether the model performance can be improved by providing more features to the model; for this, different feature combinations are provided. In a third step, it is assumed that providing more data will improve performance, so more events are used for training and validation.

We conclude that the LSTM model is able to simulate the rainfall-runoff process. A satisfactory overall model performance can be achieved using only precipitation as input data and a small training dataset of four events. But, like the HBV model, the LSTM model is not able to simulate the extreme event, because no extreme event is present within the training dataset. The LSTM model nevertheless outperforms the HBV model because it generalizes better. Furthermore, the performance of the LSTM model trained on six events can be improved by additionally providing the soil moisture class as input data, whereas providing further features results in worse performance. Providing more events to the model does not significantly improve its overall performance, although the model improved notably for the event in June 2015: once the training set contains events of higher magnitude than the 2015 event, that event is no longer out-of-sample, resulting in better model performance.

The results show the potential and limitations of using the LSTM model for modeling extreme events.

How to cite: Scherer, N. T., Usmann, M. N., Disse, M., and Huang, J.: Peak Hydrological Event Simulation with Deep Learning Algorithm, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15629, https://doi.org/10.5194/egusphere-egu23-15629, 2023.

EGU23-16658 | ECS | Orals | HS3.3

Improving large-basin streamflow simulation using a modular, differentiable, learnable graph model for routing 

Tadd Bindas, Wen-Ping Tsai, Jiangtao Liu, Farshid Rahmani, Dapeng Feng, Yuchen Bian, Kathryn Lawson, and Chaopeng Shen

Differentiable modeling has been introduced recently as a method to learn relationships from a combination of data and structural priors. This method uses end-to-end gradient tracking inside a process-based model to tune internal states and parameters along with neural networks, allowing us to learn underlying processes and spatial patterns. Hydrologic routing modules are typically needed to simulate flows in stem rivers downstream of large, heterogeneous basins, but obtaining suitable parameterizations for them has previously been difficult. In this work, we apply differentiable modeling to streamflow prediction by coupling a physically based routing model (which computes flow velocity and discharge in the river network given upstream inflow conditions) to neural networks that provide parameterizations for Manning’s river roughness parameter (n). The method consists of an embedded neural network (NN), which takes (imperfect) DL-simulated runoffs as forcings and reach-scale attributes as inputs, feeds into the Muskingum-Cunge routing method, and is trained solely on downstream discharge. Our initial results show that while we cannot identify channel geometries, we can learn a parameterization scheme for roughness that follows observed trends in n. Training on a short sample of observed data showed that we could obtain highly accurate routing results for the training gage and inner, untrained gages. This general framework can be applied at small and large scales to learn channel roughness and predict streamflow with heightened interpretability.
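
A minimal sketch of the end-to-end idea (PyTorch; the attribute-to-n mapping, the K(n) relation and the plain Muskingum recursion are simplifying assumptions relative to the full Muskingum-Cunge scheme):

    import torch
    import torch.nn as nn

    n_net = nn.Sequential(nn.Linear(5, 32), nn.ReLU(),
                          nn.Linear(32, 1), nn.Softplus())  # attributes -> Manning's n

    def muskingum_route(inflow, K, X, dt=1.0):
        """Classic Muskingum recursion; differentiable w.r.t. K (and hence n)."""
        denom = K * (1 - X) + 0.5 * dt
        c0 = (0.5 * dt - K * X) / denom
        c1 = (0.5 * dt + K * X) / denom
        c2 = (K * (1 - X) - 0.5 * dt) / denom
        out = [inflow[0]]
        for t in range(1, len(inflow)):
            out.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * out[-1])
        return torch.stack(out)

    attrs = torch.randn(5)                 # reach-scale attributes (slope, width, ...)
    inflow = torch.rand(100) + 1.0         # (imperfect) DL-simulated upstream runoff
    n = n_net(attrs).squeeze()             # learned roughness
    K = 10.0 * n                           # illustrative: travel time grows with n
    q = muskingum_route(inflow, K, X=torch.tensor(0.2))
    loss = ((q - (torch.rand(100) + 1.0)) ** 2).mean()   # misfit to downstream gage
    loss.backward()                        # gradients reach n_net: end-to-end training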

 

How to cite: Bindas, T., Tsai, W.-P., Liu, J., Rahmani, F., Feng, D., Bian, Y., Lawson, K., and Shen, C.: Improving large-basin streamflow simulation using a modular, differentiable, learnable graph model for routing, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16658, https://doi.org/10.5194/egusphere-egu23-16658, 2023.

Although deep learning (DL) models have shown extraordinary performance in hydrologic modeling, they are still hard to interpret and unable to predict untrained hydrologic variables due to their lack of physical meaning and constraints. This study established hybrid differentiable models (namely, the delta models) with regionalized parameterization and learnable structures based on a DL-based differentiable parameter learning (dPL) framework. Simulation experiments on both US and global basins demonstrate that the delta models can approach the performance of the state-of-the-art long short-term memory (LSTM) network on discharge prediction. Unlike the purely data-driven LSTM model, the delta models can output a full set of hydrologic variables not used as training targets. Evaluation with independent data sources showed that the delta models, trained only on discharge observations, can also give decent predictions for ET and baseflow. Spatial extrapolation experiments showed that the delta models can surpass the performance of the LSTM model for predictions in large ungauged regions in terms of daily hydrograph metrics and multi-year trend prediction. The spatial patterns of the parameters learned by the delta models remain remarkably stable from in-sample to spatially out-of-sample predictions, which explains the robustness of the delta models for spatial extrapolation. More importantly, the proposed modeling framework enables directly learning new relations between intermediate variables from large observational datasets. This study shows that model performance and physical meaning can be balanced with the differentiable modeling approach, which is promising for large-scale hydrologic prediction and knowledge discovery.

How to cite: Feng, D. and Shen, C.: A differentiable modeling approach to systematically integrating deep learning and physical models for large-scale hydrologic prediction and knowledge discovery, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16947, https://doi.org/10.5194/egusphere-egu23-16947, 2023.

EGU23-16974 | Orals | HS3.3

From Hindcast to Forecast with Deep Learning Streamflow Models 

Grey Nearing, Martin Gauch, Daniel Klotz, Frederik Kratzert, Asher Metzger, Guy Shalev, Shlomo Shenzis, Tadele Tekalign, Dana Weitzner, and Oren Gilon

Deep learning has become the de facto standard for streamflow simulation. While there are examples of deep learning based streamflow forecast models (e.g., 1-5), the majority of the development and research has been done with hindcast models. The primary challenge in using deep learning models for forecasting (e.g., flood forecasting) is that the meteorological input data are drawn from different distributions in hindcast vs. forecast. The (relatively small) amount of research that has been done on deep learning streamflow forecasting has largely used an encoder-decoder approach to account for forecast distribution shifts. This is, for example, what Google’s operational flood forecasting model uses [4]. 

In this work we show that the encoder-decoder approach results in artifacts in forecast trajectories that are not detectable with standard hydrological metrics, but which can cause forecasts to have incorrect trends (e.g., rising when they should be falling and vice-versa).  We solve this problem using regularized embeddings, which remove forecast artifacts without harming overall accuracy. 

Perhaps more importantly, input embeddings allow for training models on spatially and/or temporally incomplete meteorological inputs, meaning that a single model can be trained using input data that does not exist everywhere or does not exist during the entire training or forecast period. This allows models to learn from a significantly larger training data set, which is important for high-accuracy predictions. It also allows large (e.g., global) models to learn from local weather data. We demonstrate how and why this is critical for state-of-the-art global-scale streamflow forecasting. 

 

  1. Franken, Tim, et al. An operational framework for data driven low flow forecasts in Flanders. No. EGU22-6191. Copernicus Meetings, 2022.
  2. Kao, I-Feng, et al. "Exploring a Long Short-Term Memory based Encoder-Decoder framework for multi-step-ahead flood forecasting." Journal of Hydrology 583 (2020): 124631.
  3. Liu, Darong, et al. "Streamflow prediction using deep learning neural network: case study of Yangtze River." IEEE Access 8 (2020): 90069-90086.
  4. Nevo, Sella, et al. "Flood forecasting with machine learning models in an operational framework." Hydrology and Earth System Sciences 26.15 (2022): 4013-4032.
  5. Girihagama, Lakshika, et al. "Streamflow modelling and forecasting for Canadian watersheds using LSTM networks with attention mechanism." Neural Computing and Applications 34.22 (2022): 19995-20015.

 

How to cite: Nearing, G., Gauch, M., Klotz, D., Kratzert, F., Metzger, A., Shalev, G., Shenzis, S., Tekalign, T., Weitzner, D., and Gilon, O.: From Hindcast to Forecast with Deep Learning Streamflow Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16974, https://doi.org/10.5194/egusphere-egu23-16974, 2023.

EGU23-582 | ECS | Posters on site | ITS1.13/AS5.2

Modeling the Variability of Terrestrial Carbon Fluxes using Transformers 

Swarnalee Mazumder and Ayush Prasad

The terrestrial carbon cycle is one of the largest sources of uncertainty in climate projections. The terrestrial carbon sink, which removes a quarter of anthropogenic CO2 emissions, is highly variable in time and space depending on climate. Previous studies have found that data-driven models such as random forests, artificial neural networks and long short-term memory networks can be used to accurately model Net Ecosystem Exchange (NEE) and Gross Primary Productivity (GPP), two important metrics to quantify the direction and magnitude of CO2 transfer between the land surface and the atmosphere. Recently, a new class of machine learning models called transformers has gained widespread attention in natural language processing tasks due to their ability to learn from large volumes of sequential data. In this work, we use Transformers to model NEE and GPP from 1996-2022 at 39 flux stations in the ICOS Europe network using ERA5 reanalysis data. We compare our results with traditional machine learning approaches to evaluate the generalisability and predictive performance of transformers for carbon flux modelling.
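
A minimal sketch of a Transformer-encoder regressor for this setting (PyTorch; feature counts and window length are illustrative assumptions, and positional encoding is omitted for brevity):

    import torch
    import torch.nn as nn

    class FluxTransformer(nn.Module):
        """Encode a meteorological sequence and regress NEE and GPP."""
        def __init__(self, n_feat=8, d_model=64, n_targets=2):
            super().__init__()
            self.proj = nn.Linear(n_feat, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d_model, n_targets)   # -> (NEE, GPP)

        def forward(self, x):                # x: (batch, time, n_feat) ERA5 predictors
            z = self.encoder(self.proj(x))   # positional encoding omitted for brevity
            return self.head(z.mean(dim=1))  # pool over time, then regress

    pred = FluxTransformer()(torch.randn(16, 48, 8))   # -> (16, 2)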

How to cite: Mazumder, S. and Prasad, A.: Modeling the Variability of Terrestrial Carbon Fluxes using Transformers, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-582, https://doi.org/10.5194/egusphere-egu23-582, 2023.

EGU23-1825 | ECS | Orals | ITS1.13/AS5.2

Spatial representation learning for ensemble weather simulations using invariant variational autoencoders 

Jieyu Chen, Kevin Höhlein, and Sebastian Lerch

Weather forecasts today are typically issued in the form of ensemble simulations based on multiple runs of numerical weather prediction models with different perturbations in the initial states and the model physics. In light of the continuously increasing spatial resolutions of operational weather models, this results in large, high-dimensional datasets that nonetheless contain relevant spatial and temporal structure, as well as information about the predictive uncertainty. We propose invariant variational autoencoder (iVAE) models based on convolutional neural network architectures to learn low-dimensional representations of the spatial forecast fields. We specifically aim to account for the ensemble character of the input data and discuss methodological questions about the optimal design of suitable dimensionality reduction methods in this setting. Thereby, our iVAE models extend previous work in which low-dimensional representations of single, deterministic forecast fields were learned and utilized to incorporate spatial information into localized, neural-network-based ensemble post-processing methods [1], which improved upon models utilizing location-specific inputs only [2]. By additionally incorporating the ensemble dimension and learning representations of probability distributions of spatial fields, we aim to enable more flexible modeling of the relevant predictive information contained in the full forecast ensemble. Additional potential applications include data compression and the generation of forecast ensembles of arbitrary size.

We illustrate our methodological developments based on a 10-year dataset of gridded ensemble forecasts from the European Centre for Medium-Range Weather Forecasts of several meteorological variables over Europe. Specifically, we investigate alternative model architectures and highlight the importance of tailoring the loss function to the specific problem at hand.

References:

[1] Lerch, S. & Polsterer, K.L. (2022). Convolutional autoencoders for spatially-informed ensemble post-processing. ICLR 2022 AI for Earth and Space Science Workshop, https://arxiv.org/abs/2204.05102.

[2] Rasp, S. & Lerch, S. (2018). Neural networks for post-processing ensemble weather forecasts. Monthly Weather Review, 146, 3885-3900.

How to cite: Chen, J., Höhlein, K., and Lerch, S.: Spatial representation learning for ensemble weather simulations using invariant variational autoencoders, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1825, https://doi.org/10.5194/egusphere-egu23-1825, 2023.

EGU23-3117 | Orals | ITS1.13/AS5.2

AtmoRep: Large Scale Representation Learning for Atmospheric Data 

Christian Lessig, Ilaria Luise, and Martin Schultz

The AtmoRep project asks whether one can train a single neural network that represents and describes all atmospheric dynamics. AtmoRep’s ambition is hence to demonstrate that the concept of large-scale representation learning, whose feasibility and potential were established in principle by large language models such as GPT-3, is also applicable to scientific data and in particular to atmospheric dynamics. The project is enabled by the large amounts of atmospheric observations that have been made in the past as well as advances in neural network architectures and self-supervised learning that allow for effective training on petabytes of data. Eventually, we aim to train on all of the ERA5 reanalysis and, furthermore, fine-tune on observational data such as satellite measurements to move beyond the limits of reanalyses.

We will present the theoretical formulation of AtmoRep as an approximate representation of the atmosphere as a stochastic dynamical system. We will also detail our transformer-based network architecture and the training protocol for self-supervised learning, so that unlabelled data such as reanalyses, simulation outputs and observations can be employed for training and refining the network. Results will be presented on the performance of AtmoRep for downscaling, precipitation forecasting, the prediction of tropical convection initialization, and model correction. Furthermore, we also demonstrate that AtmoRep has substantial zero-shot skill, i.e., it is capable of performing well on tasks it was not trained for. Zero- and few-shot performance (or in-context learning) is one of the hallmarks of large-scale representation learning and to our knowledge has never been demonstrated in the geosciences.

How to cite: Lessig, C., Luise, I., and Schultz, M.: AtmoRep: Large Scale Representation Learning for Atmospheric Data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3117, https://doi.org/10.5194/egusphere-egu23-3117, 2023.

Numerical Earth system models (ESMs) are our primary tool for projecting future climate scenarios. Their simulation output is used by impact models that assess the effect of anthropogenic global warming, e.g., on flood events, vegetation changes or crop yields. Precipitation, an atmospheric variable with arguably one of the largest socio-economic impacts, involves various processes on a wide range of spatio-temporal scales. However, these cannot be completely resolved in ESMs due to the limited discretization of the numerical model.
This can lead to biases in the ESM output that need to be corrected in a post-processing step prior to feeding ESM output into impact models, which are calibrated with observations [1]. While established post-processing methods successfully improve the modelled temporal statistics for each grid cell individually, unrealistic spatial features that require a larger spatial context are not addressed.
Here, we apply a physically constrained cycle-consistent generative adversarial network (CycleGAN) [2] to the precipitation output of Coupled Model Intercomparison Project phase 6 (CMIP6) ESMs to correct both temporal distributions and spatial patterns. The CycleGAN can be trained naturally on daily ESM and reanalysis fields that are unpaired due to the deviating trajectories of the ESM and the observation-based ground truth.
We evaluate our method against a state-of-the-art bias adjustment framework (ISIMIP3BASD) [3] and find that it outperforms this framework in correcting spatial patterns while achieving comparable results on temporal distributions. We further discuss the representation of extreme events and suitable metrics for quantifying the realism of unpaired precipitation fields.
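For readers unfamiliar with the unpaired training idea, a minimal sketch of the cycle-consistency term is given below (our illustration; G and F denote the two generators, and the adversarial and physical-constraint terms of the full method are omitted):

```python
# Minimal sketch of the cycle-consistency term that lets unpaired ESM and
# reanalysis fields be used for training; G maps ESM -> reanalysis, F the reverse.
import torch

def cycle_consistency_loss(G, F, esm_batch, reanalysis_batch, lam=10.0):
    # round trips should reconstruct each domain, even without paired samples
    loss_esm = (F(G(esm_batch)) - esm_batch).abs().mean()            # ESM -> obs -> ESM
    loss_obs = (G(F(reanalysis_batch)) - reanalysis_batch).abs().mean()
    return lam * (loss_esm + loss_obs)  # added to the usual adversarial losses
```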

 [1] Cannon, A.J., et al. "Bias correction of GCM precipitation by quantile mapping: How well do methods preserve changes in quantiles and extremes?." Journal of Climate 28.17 (2015): 6938-6959.

[2] Zhu, J.-Y., et al. "Unpaired image-to-image translation using cycle-consistent adversarial networks." Proceedings of the IEEE international conference on computer vision. 2017.

[3] Lange, S. "Trend-preserving bias adjustment and statistical downscaling with ISIMIP3BASD (v1.0)." Geoscientific Model Development 12.7 (2019): 3055-3070.

How to cite: Hess, P., Lange, S., and Boers, N.: Improving global CMIP6 Earth system model precipitation output with generative adversarial networks for unpaired image-to-image translation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3128, https://doi.org/10.5194/egusphere-egu23-3128, 2023.

EGU23-3256 | Orals | ITS1.13/AS5.2

Emulating radiative transfer in a numerical weather prediction model 

Matthew Chantry, Peter Ukkonen, Robin Hogan, and Peter Dueben

Machine learning, and particularly neural networks, have been touted as a valuable accelerator for physical processes. By training on data generated from an existing algorithm, a network may theoretically learn a more efficient representation and accelerate the computations via emulation. For many parameterized physical processes in weather and climate models, this is being actively pursued. Here, we examine the value of this approach for radiative transfer within the IFS, an operational numerical weather prediction model where both accuracy and speed are vital. By designing custom, physics-informed neural networks we achieve outstanding offline accuracy for both longwave and shortwave processes. In coupled testing we find minimal changes to forecast scores at near-operational resolutions. We carry out coupled inference on GPUs to maximise the speed benefits of the emulator approach.

How to cite: Chantry, M., Ukkonen, P., Hogan, R., and Dueben, P.: Emulating radiative transfer in a numerical weather prediction model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3256, https://doi.org/10.5194/egusphere-egu23-3256, 2023.

EGU23-3321 | ECS | Orals | ITS1.13/AS5.2

Using machine learning to improve dynamical predictions in a coupled model 

Zikang He, Julien Brajard, Yiguo Wang, Xidong Wang, and Zheqi Shen

Dynamical models used in climate prediction often have systematic errors that can bias the predictions. In this study, we utilized machine learning to address this issue: machine learning was applied to learn the error corrections made by data assimilation, thus building a data-driven model that emulates the dynamical model error. A hybrid model was constructed by combining the dynamical and data-driven models. We tested the hybrid model using synthetic observations generated by a simplified high-resolution coupled ocean-atmosphere model (MAOOAM, De Cruz et al., 2016) and compared its performance to that of a low-resolution version of the same model used as a standalone dynamical model.

To evaluate the forecast skill of the hybrid model, we produced ensemble predictions based on initial conditions determined through data assimilation. The results show that the hybrid model significantly improves the forecast skill for both atmospheric and oceanic variables compared to the dynamical model alone. To explore what affects short-term and long-term forecast skill, we built two further hybrid models that correct errors in only the atmospheric or only the oceanic variables. For short-term atmospheric forecasts, correcting only oceanic errors has no effect on forecasts of atmospheric variables, whereas correcting only atmospheric errors yields forecast skill similar to correcting both. For long-term forecasts of oceanic variables, correcting the oceanic error alone improves the forecast skill, but correcting both atmospheric and oceanic errors yields the best skill. These results indicate that, for long-term forecasts of oceanic variables, bias correction of both the oceanic and atmospheric components has a significant effect.
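Schematically, the hybrid construction described above can be read as adding a learned error correction to each dynamical model step; the sketch below is our illustration with placeholder interfaces, not the authors' code:

```python
# Schematic of a hybrid dynamical/data-driven step: the data-driven component
# adds a learned error correction to each low-resolution model step.
import numpy as np

def hybrid_step(x, dynamical_step, error_model):
    """x: model state vector; dynamical_step: low-res model integrator;
    error_model: regressor trained on data assimilation corrections."""
    x_dyn = dynamical_step(x)                          # imperfect dynamical forecast
    return x_dyn + error_model.predict(x[None, :])[0]  # add emulated model error

def forecast(x0, n_steps, dynamical_step, error_model):
    states = [x0]
    for _ in range(n_steps):
        states.append(hybrid_step(states[-1], dynamical_step, error_model))
    return np.stack(states)
```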

How to cite: He, Z., Brajard, J., Wang, Y., Wang, X., and Shen, Z.: Using machine learning to improve dynamical predictions in a coupled model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3321, https://doi.org/10.5194/egusphere-egu23-3321, 2023.

EGU23-3340 | ECS | Orals | ITS1.13/AS5.2

An iterative data-driven emulator of an ocean general circulation model 

Rachel Furner, Peter Haynes, Dan(i) Jones, Dave Munday, Brooks Paige, and Emily Shuckburgh

Data-driven models are becoming increasingly competent at tasks fundamental to weather and climate prediction. Relative to machine learning (ML) based atmospheric models, which have shown promise in short-term forecasting, ML-based ocean forecasting remains somewhat unexplored. In this work, we present a data-driven emulator of an ocean GCM and show that performance over a single predictive step is skilful across all variables under consideration. Iterating such data-driven models poses additional challenges, with many models suffering from over-smoothing of fields or instabilities in the predictions. We compare a variety of methods for iterating our data-driven emulator and assess them by looking at how well they agree with the underlying GCM in the very short term and how realistic the fields remain for longer-term forecasts. Due to the chaotic nature of the system being forecast, we would not expect any model to agree with the GCM accurately over long time periods, but instead we expect fields to continue to exhibit physically realistic behaviour at ever increasing lead times. Specifically, we expect well-represented fields to remain stable whilst also maintaining the presence and sharpness of features seen in both reality and in GCM predictions, with reduced emphasis on accurately representing the location and timing of these features. This nuanced and temporally changing definition of what constitutes a ‘good’ forecast at increasing lead times generates questions over both (1) how one defines suitable metrics for assessing data-driven models, and perhaps more importantly, (2) identifying the most promising loss functions to use to optimise these models.

How to cite: Furner, R., Haynes, P., Jones, D., Munday, D., Paige, B., and Shuckburgh, E.: An iterative data-driven emulator of an ocean general circulation model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3340, https://doi.org/10.5194/egusphere-egu23-3340, 2023.

EGU23-4337 | Orals | ITS1.13/AS5.2 | Highlight

Towards a new surrogate model for predicting short-term NOx-O3 effects from aviation using Gaussian processes 

Pratik Rao, Richard Dwight, Deepali Singh, Jin Maruhashi, Irene Dedoussi, Volker Grewe, and Christine Frömming

While efforts have been made to curb CO2 emissions from aviation, the more uncertain non-CO2 effects, which contribute about two-thirds of the warming in terms of radiative forcing (RF), still require attention. The most important non-CO2 effects include persistent line-shaped contrails, contrail-induced cirrus clouds, nitrogen oxide (NOx) emissions that alter the ozone (O3) and methane (CH4) concentrations, both of which are greenhouse gases, and the emission of water vapour (H2O). The climate impact of these non-CO2 effects depends on the emission location and the prevailing weather situation; thus, it can potentially be reduced by advantageous re-routing of flights using Climate Change Functions (CCFs), which are a measure of the climate effect of a locally confined aviation emission. CCFs are calculated using a modelling chain starting from the instantaneous RF (iRF) measured at the tropopause that results from aviation emissions. However, the iRF is a product of computationally intensive chemistry-climate model (EMAC) simulations and is currently restricted to a limited number of days and to the North Atlantic Flight Corridor. This makes it impossible to run EMAC on an operational basis for global flight planning. A step in this direction led to a surrogate model called algorithmic Climate Change Functions (aCCFs), derived by regressing CCFs (training data) against 2 or 3 local atmospheric variables at the time of emission (features) with simple regression techniques; these aCCFs are applicable only in parts of the Northern Hemisphere. It was found that O3 aCCFs, which use temperature and geopotential as features to provide a reasonable first estimate of the short-term impact of aviation NOx on O3 warming, can be vastly improved [1]. There is aleatoric uncertainty in the full-order model (EMAC), stemming from unknown sources (missing features) and randomness in the known features, which can introduce heteroscedasticity in the data. Deterministic surrogates (e.g. aCCFs) only predict point estimates of the conditional average, thereby providing an incomplete picture of the stochastic response. Thus, the goal of this research is to build a new surrogate model for iRF, which is achieved by:

1. Expanding the geographical coverage of iRF (training data) by running EMAC simulations in more regions (North & South America, Eurasia, Africa and Australasia) at multiple cruise flight altitudes,

2. Following an objective approach to selecting atmospheric variables (feature selection) and considering the importance of local as well as non-local effects,

3. Regressing the iRF against selected atmospheric variables using supervised machine learning techniques such as homoscedastic and heteroscedastic Gaussian process regression (see the sketch below).

We present a new surrogate model that predicts iRF of aviation NOx-O3 effects on a regular basis with confidence levels, which not only improves our scientific understanding of NOx-O3 effects, but also increases the potential of global climate-optimised flight planning.
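As an illustration of item 3 above, a homoscedastic Gaussian process regression can be set up in a few lines with scikit-learn (the heteroscedastic case requires a custom likelihood); the features and targets below are placeholders, not the EMAC training data:

```python
# Sketch of the homoscedastic variant with scikit-learn; data are placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

X = np.random.rand(500, 3)   # e.g. temperature, geopotential, a third feature
y = np.random.rand(500)      # iRF training targets (placeholder)

# anisotropic RBF kernel plus a homoscedastic noise term
kernel = 1.0 * RBF(length_scale=np.ones(3)) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

mean, std = gp.predict(X[:5], return_std=True)  # predictions with confidence levels
```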

References

[1] Rao, P.; et al. Case Study for Testing the Validity of NOx-Ozone Algorithmic Climate Change Functions for Optimising Flight Trajectories. Aerospace 2022, 9, 231. https://doi.org/10.3390/aerospace9050231

How to cite: Rao, P., Dwight, R., Singh, D., Maruhashi, J., Dedoussi, I., Grewe, V., and Frömming, C.: Towards a new surrogate model for predicting short-term NOx-O3 effects from aviation using Gaussian processes, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4337, https://doi.org/10.5194/egusphere-egu23-4337, 2023.

Time transfer functions describe the change of state variables over time in geoscientific numerical simulation models. The identification of these functions is an essential but challenging step in model building. While traditional methods rely on qualitative understanding or first-order principles, the availability of large spatio-temporal datasets from direct measurements or extremely detailed physics-based system modelling has enabled the use of machine learning methods to discover time transfer functions directly from data. In this study we explore the feasibility of this data-driven approach for numerical simulation of the co-evolution of soil, hydrology, vegetation, and grazing at landscape scale and geological timescales. Empirical observation and hyper-resolution (1 m, 1 week) modelling (Karssenberg et al., 2017) have shown that a hillslope system exhibits complex behaviour with two stable states: high biomass on deep soils (healthy state) and low biomass on thin soils (degraded or desertic state). A catastrophic shift from the healthy to the degraded state occurs under changes in external forcing (climate, grazing pressure), with a transient between states that is rapid or slow depending on system characteristics.

To identify and use the time transfer functions of this system at hillslope scale we follow four procedural steps. First, an extremely large dataset of hillslope-average soil and vegetation state is generated by a mechanistic hyper-resolution (1 m, 1 week) system model, forced with different variations in grazing pressure over time. Second, a machine learning model predicting the rate of change in soil and vegetation as a function of soil, vegetation, and grazing pressure is trained on this dataset. Third, we explore the ability of this trained model to predict the rate of system change (soil and vegetation) on untrained data. Finally, we use the trained model as the time transfer function in a forward numerical simulation of a hillslope to determine whether it can represent the known complex behaviour of the system (see the sketch below).

Our findings are that the approach is in principle feasible. We compared a deep neural network and a random forest. Both achieve high fitting precision, although the latter runs much faster and requires less training data. Even though the machine-learning-based time transfer function shows differences in the rates of change of system state from those calculated using expert knowledge in Karssenberg et al. (2017), forward simulation proved possible, with system behaviour generally in line with that observed in the data from the hyper-resolution model. Our findings indicate that discovery of time transfer functions from data is possible. Next steps should involve the use of observational data (e.g., from remote sensing) to test the approach on real-world systems.
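The sketch referred to above illustrates steps two and four under strong simplifications (placeholder data, two state variables, forward Euler time stepping); it is our reading of the approach, not the study's code:

```python
# Illustrative sketch: a random forest learns rates of change (step two) and
# then serves as the time transfer function in forward simulation (step four).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# training pairs from the hyper-resolution model (placeholder arrays):
# inputs are hillslope-average soil depth, biomass and grazing pressure;
# targets are the corresponding rates of change of soil depth and biomass
inputs = np.random.rand(10000, 3)
rates = np.random.rand(10000, 2)
transfer = RandomForestRegressor(n_estimators=200).fit(inputs, rates)

def simulate(soil0, biomass0, grazing, dt=1.0):
    """Forward Euler with the learned time transfer function."""
    state = np.array([soil0, biomass0], dtype=float)
    trajectory = [state.copy()]
    for g in grazing:  # external forcing series (grazing pressure over time)
        dstate = transfer.predict([[state[0], state[1], g]])[0]
        state += dt * dstate
        trajectory.append(state.copy())
    return np.array(trajectory)
```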


Karssenberg, D., Bierkens, M.F.P., & Rietkerk, M. (2017). Catastrophic shifts in semiarid vegetation-soil systems may unfold rapidly or slowly. The American Naturalist, 190, E145–E155.

How to cite: Pomarol Moya, O. and Karssenberg, D.: Machine learning for data driven discovery of time transfer functions in numerical modelling: simulating catastrophic shifts in vegetation-soil systems, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4454, https://doi.org/10.5194/egusphere-egu23-4454, 2023.

EGU23-4695 | Posters on site | ITS1.13/AS5.2

Development of PBL Parameterization Emulator using Neural Networks 

Jiyeon Jang, Tae-Jin Oh, Sojung An, Wooyeon Park, Inchae Na, and Junghan Kim

Physical parameterization is one of the major components of a Numerical Weather Prediction system. In the Korean Integrated Model (KIM), physical parameterizations account for about 30% of the total computation time. There have been many studies on developing neural-network-based emulators to replace and accelerate physics-based parameterizations. In this study, we develop a planetary boundary layer (PBL) emulator based on the Shin-Hong scheme (Hong et al., 2006, 2010; Shin and Hong, 2013, 2015), which computes the parameterized effects of vertical turbulent eddy diffusion of momentum, water vapor, and sensible heat fluxes. We compare the emulator performance of Multi-Layer Perceptron (MLP) based architectures: a simple MLP, an MLP application version, and an MLP-Mixer (Tolstikhin et al., 2021). The MLP application version divides the data into several vertical groups to better approximate each group of layers. The MLP-Mixer is an MLP-based architecture that performs well in computer vision without using convolution or self-attention. Evaluating the resulting emulators, the MLP application version and the MLP-Mixer show significant performance improvements over the simple MLP.

How to cite: Jang, J., Oh, T.-J., An, S., Park, W., Na, I., and Kim, J.: Development of PBL Parameterization Emulator using Neural Networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4695, https://doi.org/10.5194/egusphere-egu23-4695, 2023.

EGU23-4817 | ECS | Posters on site | ITS1.13/AS5.2

Algorithmic optimisation of key parameters of OpenIFS 

Lauri Tuppi, Madeleine Ekblom, Pirkka Ollinaho, and Heikki Järvinen

Numerical weather prediction models contain parameters that are inherently uncertain and cannot be determined exactly. Traditionally, parameter tuning has been done manually, which can be an extremely laborious task. Tuning the entire model usually requires adjusting a relatively large number of parameters, and the need to balance several requirements at the same time can turn manual tuning into a maze of subjective choices. It is therefore desirable to have reliable, objective approaches for estimating the optimal values and uncertainties of these parameters. In this presentation we show how to optimise 20 key physical parameters that have a strong impact on forecast quality. These parameters belong to the Stochastically Perturbed Parameters Scheme in the atmospheric model Open Integrated Forecasting System (OpenIFS).

The results show that simultaneous optimisation of O(20) parameters is possible with O(100) algorithm steps using an ensemble of O(20) members, and that the optimised parameters lead to substantial enhancement of predictive skill. The enhanced predictive skill can be attributed to reduced biases in low-level winds and upper-tropospheric humidity in the optimised model. We find that the optimisation process is dependent on the starting values of the parameters that are optimised (starting from better suited values results in a better model). The results also show that the applicability of the tuned parameter values across different model resolutions is somewhat questionable since the model biases seem to be resolution-specific. Moreover, our optimisation algorithm tends to treat the parameter covariances poorly limiting its ability to converge to the global optimum.
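The abstract does not specify the optimisation algorithm, so the following generic ensemble-based loop is purely illustrative of the O(20)-member, O(100)-step setting; score() is a hypothetical function that would run forecasts with candidate parameters and return an error measure:

```python
# Generic ensemble-based optimisation loop (illustrative stand-in, not the
# algorithm used in the study): perturb parameters, score each member, and
# step against the ensemble-estimated gradient of the error.
import numpy as np

def optimise(score, theta0, sigma=0.05, members=20, steps=100, lr=0.2):
    theta = np.array(theta0, dtype=float)   # ~20 parameter values
    for _ in range(steps):
        perturbs = sigma * np.random.randn(members, theta.size)
        scores = np.array([score(theta + p) for p in perturbs])
        weights = (scores - scores.mean()) / (scores.std() + 1e-12)
        grad = (weights[:, None] * perturbs).mean(axis=0) / sigma
        theta -= lr * grad                   # move toward lower forecast error
    return theta
```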

How to cite: Tuppi, L., Ekblom, M., Ollinaho, P., and Järvinen, H.: Algorithmic optimisation of key parameters of OpenIFS, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4817, https://doi.org/10.5194/egusphere-egu23-4817, 2023.

EGU23-5003 | ECS | Posters on site | ITS1.13/AS5.2

Towards machine-learning calibration of cloud parameters in the kilometre-resolution ICON atmosphere model 

Hannah Marie Eichholz, Jan Kretzschmar, Duncan Watson-Parris, Josefine Umlauft, and Johannes Quaas

In preparation for the global kilometre-resolution coupled ICON climate model, it is necessary to calibrate cloud microphysical parameters. Here we explore an avenue towards optimally calibrating such parameters using machine learning. The emulator developed by Watson-Parris et al. (2021) is employed in combination with a perturbed-parameter ensemble of limited-area, atmosphere-only ICON simulations over the North Atlantic Ocean, in which different cloud microphysical parameters are varied in order to evaluate their influence on the simulated radiation fluxes. In a first step, the autoconversion scaling parameter is calibrated using satellite-retrieved top-of-atmosphere and bottom-of-atmosphere radiation fluxes.

How to cite: Eichholz, H. M., Kretzschmar, J., Watson-Parris, D., Umlauft, J., and Quaas, J.: Towards machine-learning calibration of cloud parameters in the kilometre-resolution ICON atmosphere model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5003, https://doi.org/10.5194/egusphere-egu23-5003, 2023.

EGU23-5149 | ECS | Posters on site | ITS1.13/AS5.2

Machine Learning Parameterization for Super-droplet Cloud Microphysics Scheme 

Shivani Sharma and David Greenberg

Machine learning approaches have been widely used for improving the representation of subgrid scale parameterizations in Earth System Models. In our study we target the Cloud Microphysics parameterization, in particular the two-moment bulk scheme of the ICON (Icosahedral Non-hydrostatic) Model. 


Cloud microphysics parameterization schemes suffer from an accuracy/speed tradeoff. The simplest schemes, often heavy with assumptions (such as the bulk moment schemes), are most common in operational weather prediction models. Conversely, more complex schemes with fewer assumptions, e.g. Lagrangian schemes such as the super-droplet method (SDM), are computationally expensive and used only within research and development. SDM allows easy representation of complex scenarios with multiple hydrometeors and can also be used for simulating cloud-aerosol interactions. To bridge this gap and make the use of more complex microphysical schemes feasible within operational models, we use a data-driven approach.


Here we train a neural network to mimic the behavior of SDM simulations in a warm-rain scenario in a dimensionless control volume. The network behaves like a dynamical system that converts cloud droplets to rain droplets, represented as bulk moments, with only the current system state as input. We use a multi-step training loss to stabilize the network over long integration periods, especially in cases starting with extremely low cloud water. We find that the network is stable across various initial conditions and, in many cases, emulates the SDM simulations better than the traditional bulk moment schemes. Our network also performs better than previous ML-based attempts to learn from SDM. This opens the possibility of using the trained network as a proxy for the computationally expensive SDM within operational weather prediction models with minimal computational overhead.
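A minimal sketch of the multi-step training loss idea follows (our illustration; the network interface and tensor shapes are assumptions):

```python
# Sketch of a multi-step loss: unroll the emulator for several steps and
# penalise the whole trajectory, which stabilises long integrations.
import torch

def multistep_loss(net, state0, targets):
    """state0: (batch, n_moments); targets: (steps, batch, n_moments) from SDM."""
    loss, state = 0.0, state0
    for target in targets:           # unroll the emulator through time
        state = net(state)           # network maps current moments to the next
        loss = loss + ((state - target) ** 2).mean()
    return loss / len(targets)
```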

How to cite: Sharma, S. and Greenberg, D.: Machine Learning Parameterization for Super-droplet Cloud Microphysics Scheme, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5149, https://doi.org/10.5194/egusphere-egu23-5149, 2023.

EGU23-5523 | ECS | Orals | ITS1.13/AS5.2

Using weak constrained neural networks to improve simulations in the gray zone 

Yvonne Ruckstuhl, Raphael Kriegmair, Stephan Rasp, and George Craig

Machine learning represents a potential method to cope with the gray zone problem of representing motions in dynamical systems on scales comparable to the model resolution. Here we explore the possibility of using a neural network to directly learn the error caused by unresolved scales. We use a modified shallow water model which includes highly nonlinear processes mimicking atmospheric convection. To create the training dataset, we run the model in a high- and a low-resolution setup and compare the difference after one low-resolution time step, starting from the same initial conditions, thereby obtaining an exact target. The neural network is able to learn a large portion of the difference when evaluated on single time step predictions on a validation dataset. When coupled to the low-resolution model, we find large forecast improvements of up to 1 day on average. After this, the accumulated error due to the network's violation of mass conservation starts to dominate and deteriorates the forecast. This deterioration can effectively be delayed by adding a penalty term to the training loss function that enforces mass conservation in a weak sense. This study reinforces the need to include physical constraints in neural network parameterizations.
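The weak constraint can be sketched as a penalty on the net mass introduced by the network's correction; the channel layout and weighting below are illustrative assumptions:

```python
# Sketch of a weak mass-conservation constraint: penalise the net mass added
# by the learned correction, weighted against the prediction error.
import torch

def constrained_loss(correction, target, h_index=0, alpha=0.1):
    """correction/target: (batch, channels, ny, nx); h_index: fluid depth channel."""
    mse = ((correction - target) ** 2).mean()
    mass_violation = correction[:, h_index].sum(dim=(-2, -1))  # net mass per sample
    return mse + alpha * (mass_violation ** 2).mean()
```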

How to cite: Ruckstuhl, Y., Kriegmair, R., Rasp, S., and Craig, G.: Using weak constrained neural networks to improve simulations in the gray zone, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5523, https://doi.org/10.5194/egusphere-egu23-5523, 2023.

EGU23-5766 | ECS | Orals | ITS1.13/AS5.2

Best Practices for Fortran-Python Bridges to Integrate Neural Networks in Earth System Models 

Caroline Arnold, Shivani Sharma, Tobias Weigel, and David Greenberg

In recent years, machine learning (ML) based parameterizations have become increasingly common in Earth System Models (ESM). Sub-grid scale physical processes that would be computationally too expensive, e.g., atmospheric chemistry and cloud microphysics, can be emulated by ML algorithms such as neural networks.

Neural networks are trained first on simulations of the sub-grid scale process that is to be emulated. They are then used in so-called inference mode to make predictions during the ESM run, replacing the original parameterization. Training usually requires GPUs, while inference may be done on CPU architectures.

At first, neural networks are evaluated offline, i.e., independently of the ESM on appropriate datasets. However, their performance can ultimately only be evaluated in an online setting, where the ML algorithm is coupled to the ESM, including nonlinear interactions.

We want to shorten the time spent in neural network development and offline testing and move quickly to online evaluation of ML components in our ESM of choice, ICON (Icosahedral Nonhydrostatic Weather and Climate Model). Since ICON is written in Fortran, and modern ML algorithms are developed in the Python ecosystem, this requires efficient bridges between the two programming languages. The Fortran-Python bridge must be flexible to allow for iterative development of the neural network. Changes to the ESM codebase should be as few as possible, and the runtime overhead should not limit development.

In our contribution we explore three strategies to call the neural network inference from within Fortran using (i) embedded Python code compiled in a dynamic library, (ii) pipes, and (iii) MPI using the ICON coupler YAC. We provide quantitative benchmarks for the proposed Fortran-Python bridges and assess their overall suitability in a qualitative way to derive best practices. The Fortran-Python bridge enables scientists and developers to evaluate ML components in an online setting, and can be extended to other parameterizations and ESMs.
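As an illustration of strategy (ii), the Python side of a pipe-based bridge could look like the following sketch, assuming a simple fixed-size float64 exchange protocol (pipe names and sizes are hypothetical):

```python
# Sketch of the Python side of a pipe-based Fortran-Python bridge; the Fortran
# side would write raw bytes to 'to_python' and read results from 'to_fortran'.
import numpy as np

N_IN, N_OUT = 256, 128                      # illustrative field sizes

def serve(model, in_pipe="to_python", out_pipe="to_fortran"):
    with open(in_pipe, "rb") as fin, open(out_pipe, "wb") as fout:
        while True:
            raw = fin.read(N_IN * 8)        # one float64 input vector per call
            if len(raw) < N_IN * 8:
                break                       # Fortran side closed the pipe
            x = np.frombuffer(raw, dtype=np.float64)
            y = model(x).astype(np.float64) # model: any Python inference callable
            fout.write(y.tobytes())
            fout.flush()
```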

How to cite: Arnold, C., Sharma, S., Weigel, T., and Greenberg, D.: Best Practices for Fortran-Python Bridges to Integrate Neural Networks in Earth System Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5766, https://doi.org/10.5194/egusphere-egu23-5766, 2023.

EGU23-6287 | Orals | ITS1.13/AS5.2

Approximation and Optimization of Atmospheric Simulations in High Spatio-Temporal Resolution with Neural Networks 

Elnaz Azmi, Jörg Meyer, Marcus Strobl, Michael Weimer, and Achim Streit

Accurate forecasts of the atmosphere demand large-scale simulations with high spatio-temporal resolution. Atmospheric chemistry modeling, for example, usually requires solving a system of hundreds of coupled ordinary differential equations. Due to this computational complexity, large high-performance computing resources are required, which becomes challenging as the spatio-temporal resolution increases. Machine learning methods, and especially deep learning, can offer an approximation of the simulations with some factor of speed-up while using fewer compute resources. The goal of this study is to investigate the feasibility and opportunities, but also the challenges and pitfalls, of replacing the compute-intensive chemistry of a state-of-the-art atmospheric chemistry model with a trained neural network model that forecasts the concentration of trace gases at each grid cell while reducing the computational complexity of the simulation. In this work, we introduce a neural network model (ICONET) to forecast trace gas concentrations without executing the traditional compute-intensive atmospheric simulations. ICONET is equipped with a multifeature Long Short-Term Memory (LSTM) model to forecast atmospheric chemicals iteratively in time. We generated the training and test dataset, our ground truth for ICONET, by executing an atmospheric chemistry simulation in ICON-ART. Applying the trained ICONET model to a test dataset yields forecasts that fit our ground truth well. We discuss appropriate metrics to evaluate the quality of such models and present the quality of the ICONET forecasts in terms of RMSE and KGE. The variety in the nature of trace gases limits the model's learning and forecast skill, depending on the variable. In addition to the quality of the ICONET forecasts, we describe its computational efficiency as the run-time speed-up relative to the ICON-ART simulation. ICONET showed a speed-up factor of 3.1 over the run time of the atmospheric chemistry simulation in ICON-ART, a significant achievement, especially considering the importance of ensemble simulations.
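A minimal sketch of iterative LSTM forecasting of this kind follows (our illustration; the species count, hidden size and single-feature interface are assumptions, not the ICONET configuration):

```python
# Sketch of iterative trace-gas forecasting with an LSTM; sizes are placeholders.
import torch
import torch.nn as nn

class TraceGasLSTM(nn.Module):
    def __init__(self, n_species=10, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_species, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_species)

    def forward(self, history):
        # history: (batch, time, n_species) -> next concentrations
        h, _ = self.lstm(history)
        return self.out(h[:, -1])

def roll_forward(model, history, n_steps):
    preds = []
    for _ in range(n_steps):      # feed predictions back in, step by step
        nxt = model(history)
        preds.append(nxt)
        history = torch.cat([history[:, 1:], nxt[:, None]], dim=1)
    return torch.stack(preds, dim=1)
```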

How to cite: Azmi, E., Meyer, J., Strobl, M., Weimer, M., and Streit, A.: Approximation and Optimization of Atmospheric Simulations in High Spatio-Temporal Resolution with Neural Networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6287, https://doi.org/10.5194/egusphere-egu23-6287, 2023.

EGU23-6836 | ECS | Posters on site | ITS1.13/AS5.2

Parameterising melt at the base of Antarctic ice shelves with a feedforward neural network 

Clara Burgard, Nicolas C. Jourdain, Pierre Mathiot, and Robin Smith

One of the largest sources of uncertainty when projecting the Antarctic contribution to sea-level rise is the ocean-induced melt at the base of Antarctic ice shelves. This is because resolving the ocean circulation and the ice-ocean interactions occurring in the cavity below the ice shelves is computationally expensive.

Instead, for large ensembles and long-term projections of the ice-sheet evolution, ice-sheet models currently rely on parameterisations to link the ocean temperature and salinity in front of ice shelves to the melt at their base. However, current physics-based parameterisations struggle to accurately simulate basal melt patterns.

As an alternative approach, we explore the potential use of a deep feedforward neural network as a basal melt parameterisation. To do so, we train a neural network to emulate basal melt rates simulated by highly-resolved circum-Antarctic ocean simulations. We explore the influence of different input variables and show that the neural network struggles to generalise to ice-shelf geometries unseen during training, while it generalises better on timesteps unseen during training. We also test the parameterisation on separate coupled ocean-ice simulations to assess the neural network’s performance on independent data.  

How to cite: Burgard, C., Jourdain, N. C., Mathiot, P., and Smith, R.: Parameterising melt at the base of Antarctic ice shelves with a feedforward neural network, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6836, https://doi.org/10.5194/egusphere-egu23-6836, 2023.

EGU23-7281 | ECS | Posters on site | ITS1.13/AS5.2

Neural network surrogate models for multiple scattering: Application to OMPS LP simulations 

Michael Himes, Natalya Kramarova, Tong Zhu, Jungbin Mok, Matthew Bandel, Zachary Fasnacht, and Robert Loughman

Retrieving ozone from limb measurements necessitates the modeling of scattered light through the atmosphere.  However, accurately modeling multiple scattering (MS) during retrieval requires excessive computational resources; consequently, operational retrieval models employ approximations in lieu of the full MS calculation.  Here we consider an alternative MS approximation method, where we use radiative transfer (RT) simulations to train neural network models to predict the MS radiances.  We present our findings regarding the best-performing network hyperparameters, normalization schemes, and input/output data structures.  Using RT calculations based on measurements by the Ozone Mapping and Profiling Suite's Limb Profiler (OMPS/LP), we compare the accuracy of these neural-network models with both the full MS calculation as well as the current MS approximation methods utilized during OMPS/LP retrievals.

How to cite: Himes, M., Kramarova, N., Zhu, T., Mok, J., Bandel, M., Fasnacht, Z., and Loughman, R.: Neural network surrogate models for multiple scattering: Application to OMPS LP simulations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7281, https://doi.org/10.5194/egusphere-egu23-7281, 2023.

EGU23-7368 | ECS | Posters on site | ITS1.13/AS5.2

Comparison of Methods for Learning Differential Equations from Data 

Christof Schötz

Some results from the DEEB (Differential Equation Estimation Benchmark) are presented. In DEEB, we compare different machine learning approaches and statistical methods for estimating nonlinear dynamics from data. Such methods constitute an important building block for purely data-driven earth system models as well as hybrid models which combine physical knowledge with past observations.

Specifically, we examine approaches for solving the following problem: Given time-state observations of a deterministic ordinary differential equation (ODE) with measurement noise in the state, predict the future evolution of the system. Of particular interest are systems with chaotic behavior, like Lorenz 63, and nonparametric settings in which the functional form of the ODE is completely unknown (in particular, not restricted to a polynomial of low order). To allow a fair comparison of methods, a benchmark database was created that includes datasets of simulated observations from dynamical systems of different complexity and with varying noise levels. The methods we compare include echo state networks, Gaussian processes, neural ODEs, SINDy, thin plate splines, and more.
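One simple nonparametric baseline of the kind compared here can be sketched as follows (our illustration, not a DEEB reference implementation): estimate derivatives from the noisy trajectory with smoothing splines, regress them on the state, and integrate the learned right-hand side:

```python
# Sketch: derivative estimation via smoothing splines, nonparametric regression
# of dx/dt on x, then integration of the learned ODE to predict the future.
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.integrate import solve_ivp
from sklearn.neighbors import KNeighborsRegressor

def learn_ode(t, X):
    """t: (n,) sample times; X: (n, d) noisy state observations."""
    dX = np.column_stack([
        UnivariateSpline(t, X[:, j], k=4).derivative()(t) for j in range(X.shape[1])
    ])
    return KNeighborsRegressor(n_neighbors=5).fit(X, dX)  # f(x) ~ dx/dt

def predict(rhs, x0, t_end):
    # integrate the learned right-hand side forward from x0
    return solve_ivp(lambda t, x: rhs.predict(x[None, :])[0],
                     (0.0, t_end), x0, dense_output=True)
```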

Although some methods consistently perform better than others throughout different datasets, there seems to be no silver bullet.

How to cite: Schötz, C.: Comparison of Methods for Learning Differential Equations from Data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7368, https://doi.org/10.5194/egusphere-egu23-7368, 2023.

EGU23-7391 | ECS | Posters on site | ITS1.13/AS5.2

Learning fluid dynamical statistics using stochastic neural networks 

Martin Brolly

Many practical problems in fluid dynamics demand an empirical approach, where statistics estimated from data inform understanding and modelling. In this context data-driven probabilistic modelling offers an elegant alternative to ad hoc estimation procedures. Probabilistic models are useful as emulators, but also offer an attractive means of estimating particular statistics of interest. In this paradigm one can rely on proper scoring rules for model comparison and validation, and invoke Bayesian statistics to obtain rigorous uncertainty quantification. Stochastic neural networks provide a particularly rich class of probabilistic models, which, when paired with modern optimisation algorithms and GPUs, can be remarkably efficient. We demonstrate this approach by learning the single particle transition density of ocean surface drifters from decades of Global Drifter Program observations using a Bayesian mixture density network. From this we derive maps of various displacement statistics and corresponding uncertainty maps. Our model also offers a means of simulating drifter trajectories as a discrete-time Markov process, which could be used to study the transport of plankton or plastic in the upper ocean.
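A minimal, non-Bayesian sketch of the mixture density network component follows (our illustration; layer sizes and the number of components are assumptions, and the diagonal-Gaussian mixture is a simplification):

```python
# Sketch of a mixture density network for a 2-D displacement density; a
# Bayesian treatment would additionally place priors over the weights.
import math
import torch
import torch.nn as nn

class MDN(nn.Module):
    """Predicts a 2-D Gaussian mixture over displacements given a position."""
    def __init__(self, n_components=8):
        super().__init__()
        self.k = n_components
        # outputs per component: mixture logit, 2 means, 2 log-stddevs
        self.net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                                 nn.Linear(64, self.k * 5))

    def loss(self, pos, displacement):
        p = self.net(pos).view(-1, self.k, 5)
        log_pi = torch.log_softmax(p[..., 0], dim=-1)
        mu, log_sig = p[..., 1:3], p[..., 3:5]
        z = (displacement[:, None, :] - mu) / log_sig.exp()
        # diagonal-Gaussian log-density of each mixture component
        log_n = -0.5 * (z ** 2).sum(-1) - log_sig.sum(-1) - math.log(2 * math.pi)
        return -torch.logsumexp(log_pi + log_n, dim=-1).mean()  # negative log-likelihood
```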

How to cite: Brolly, M.: Learning fluid dynamical statistics using stochastic neural networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7391, https://doi.org/10.5194/egusphere-egu23-7391, 2023.

EGU23-7492 | Posters on site | ITS1.13/AS5.2

Machine Learning and Microseism as a Tool for Sea Wave Monitoring 

Flavio Cannavo', Vittorio Minio, Susanna Saitta, Salvatore Alparone, Alfio Marco Borzì, Andrea Cannata, Giuseppe Ciraolo, Danilo Contrafatto, Sebastiano D’Amico, Giuseppe Di Grazia, and Graziano Larocca

Monitoring the state of the sea is a fundamental task for economic activities in the coastal zone, such as transport, tourism and infrastructure design. In recent years, regular wave height monitoring for marine risk assessment and mitigation has become unavoidable as global warming results in more intense and frequent swells.
In particular, the Mediterranean Sea has been identified as one of the regions most responsive to global warming, which may promote the intensification of hazardous natural phenomena such as strong winds, heavy precipitation and high sea waves. Because of the high population density along the Mediterranean coastlines, heavy swells can have major socio-economic consequences. To reduce the impacts of such scenarios, the development of more advanced monitoring systems of the sea state becomes necessary.
In the last decade, it has been demonstrated that seismometers can be used to measure sea conditions by exploiting the characteristics of a part of the seismic signal called microseism. Microseism is the continuous seismic signal recorded in the frequency band between 0.05 and 0.4 Hz that is likely generated by interactions of sea waves with each other and with the seafloor or shorelines.
In this work, in the framework of the i-WaveNET INTERREG project, we performed a regression analysis to develop a model capable of predicting the sea state in the Sicily Channel (Italy) using microseism acquired by onshore instruments installed in Sicily and Malta. Considering the complexity of the relationship between spatial sea wave height data and seismic data measured at individual stations, we used supervised machine learning (ML) techniques to develop the prediction model. As input data we used the hourly Root Mean Squared (RMS) amplitude of the seismic signal recorded by 14 broadband stations, along the three components and in different frequency bands, during 2018–2021. These stations, belonging to the permanent seismic networks managed by the National Institute of Geophysics and Volcanology (INGV) and the Department of Geosciences of the University of Malta, consist of three-component broadband seismometers recording at a sampling frequency of 100 Hz.
As the target, the significant wave height data from the Copernicus Marine Environment Monitoring Service (CMEMS) for the same period were used. These data are the hindcast product of the Mediterranean Sea Waves forecasting system, with hourly temporal resolution and 1/24° spatial resolution. After a feature selection step, we compared three different kinds of ML algorithms for regression: K-Nearest-Neighbors (KNN), Random Forest (RF) and Light Gradient Boosting (LGB). The hyperparameters were tuned using a grid-search algorithm, and the best models were selected by cross-validation. Different metrics, such as MAE, R2 and RMSE, were considered to evaluate the generalization capabilities of the models, and special attention was paid to the models' predictive ability for extreme wave height values.
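As an illustration of the tuning step, a grid search with cross-validation over one of the three regressors can be set up as follows (the feature and target arrays are placeholders for the RMS amplitudes and CMEMS wave heights):

```python
# Sketch of hyperparameter tuning by grid search with cross-validation;
# X and y are placeholders for the actual features and targets.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X = np.random.rand(2000, 42)   # RMS amplitudes: stations x components x bands
y = np.random.rand(2000)       # significant wave height at one grid point

search = GridSearchCV(
    RandomForestRegressor(),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10, 20]},
    scoring="neg_root_mean_squared_error", cv=5)
search.fit(X, y)
print(search.best_params_, -search.best_score_)   # best model and its RMSE
```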
Results show model predictive capabilities good enough to develop a sea monitoring system to complement the systems currently in use.

How to cite: Cannavo', F., Minio, V., Saitta, S., Alparone, S., Borzì, A. M., Cannata, A., Ciraolo, G., Contrafatto, D., D’Amico, S., Di Grazia, G., and Larocca, G.: Machine Learning and Microseism as a Tool for Sea Wave Monitoring, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7492, https://doi.org/10.5194/egusphere-egu23-7492, 2023.

EGU23-7561 | ECS | Posters on site | ITS1.13/AS5.2

Deep Learning guided statistical downscaling of climate projections for use in hydrological impact modeling in Danish peatlands 

Thea Quistgaard, Peter L. Langen, Tanja Denager, Raphael Schneider, and Simon Stisen

A course of action to combat the emission of greenhouse gases (GHG) in a Danish context is to re-wet previously drained peatlands and thereby return them to their natural hydrological state, in which they act as GHG sinks. GHG emissions from peatlands are known to be closely coupled to the hydrological dynamics through the groundwater table depth (WTD). To understand the effect of a changing and variable climate on the spatio-temporal dynamics of hydrological processes and the associated uncertainties, we aim to produce a high-resolution, local-scale climate projection ensemble from the global-scale CMIP6 projections.

With a focus on hydrological impacts, uncertainties and possible extreme endmembers, this study aims to span the full ensemble of local-scale climate projections in the Danish geographical area corresponding to the CMIP6 ensemble of Global Climate Models (GCMs). Deep-learning-based statistical downscaling methods are applied to bridge the gap from GCMs to local-scale climate change and variability, which in turn will be used in field-scale hydrological modeling. The approach is developed to specifically accommodate the resolutions, event types and conditions relevant for assessing the impacts on peatland GHG emissions through their relationship with WTD dynamics, by applying stacked conditional generative adversarial networks (CGANs) to downscale precipitation, temperature, and evaporation. In the future, the approach is anticipated to be extended to directly assess the impacts of climate change and ensemble uncertainty on peatland hydrology variability and extremes.

How to cite: Quistgaard, T., Langen, P. L., Denager, T., Schneider, R., and Stisen, S.: Deep Learning guided statistical downscaling of climate projections for use in hydrological impact modeling in Danish peatlands, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7561, https://doi.org/10.5194/egusphere-egu23-7561, 2023.

EGU23-8288 | Orals | ITS1.13/AS5.2

Learning operational altimetry mapping from ocean models 

Quentin Febvre, Ronan Fablet, Julien Le Sommer, Clément Ubelmann, and Simon Benaïchouche

In oceanography, altimetry products are used to measure the height of the ocean surface, and ocean modeling is used to understand and predict the behavior of the ocean. There are two main types of gridded altimetry products: operational sea level products, such as DUACS, which are used for forecasting and reconstruction, and ocean model reanalyses, such as Glorys 12, which are used to forecast seasonal trends and assess physical characteristics. However, advances in ocean modeling do not always directly benefit operational forecast or reconstruction products.

In this study, we investigate the potential for deep learning methods, which have been successfully applied in simulated setups, to leverage ocean modeling efforts for improving operational altimetry products. Specifically, we ask under what conditions the knowledge learned from ocean simulations can be applied to real-world operational altimetry mapping. We consider the impact of simulation grid resolution, observation data reanalysis, and physical processes modeled on the performance of a deep learning model.

Our results show that the deep learning model outperforms current operational methods on a regional domain around the Gulf Stream, with a 50 km improvement in resolved scale. This improvement has the potential to enhance the accuracy of operational altimetry products, which are used for a range of important applications, such as climate monitoring and understanding mesoscale ocean dynamics.

How to cite: Febvre, Q., Fablet, R., Le Sommer, J., Ubelmann, C., and Benaïchouche, S.: Learning operational altimetry mapping from ocean models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8288, https://doi.org/10.5194/egusphere-egu23-8288, 2023.

EGU23-9285 | ECS | Orals | ITS1.13/AS5.2

Stabilized Neural Differential Equations for Hybrid Modeling with Conservation Laws 

Alistair White and Niklas Boers

Neural Differential Equations (NDEs) provide a powerful framework for hybrid modeling. Unfortunately, the flexibility of the neural network component of the model comes at the expense of potentially violating known physical invariants, such as conservation laws, during inference. This shortcoming is especially critical for applications requiring long simulations, such as climate modeling, where significant deviations from the physical invariants can develop over time. It is hoped that enforcing physical invariants will help address two of the main barriers to adoption for hybrid models in climate modeling: (1) long-term numerical stability, and (2) generalization to out-of-sample conditions unseen during training, such as climate change scenarios. We introduce Stabilized Neural Differential Equations, which augment an NDE model with compensating terms that ensure physical invariants remain approximately satisfied during numerical simulations. We apply Stabilized NDEs to the double pendulum and Hénon–Heiles systems, both of which are conservative, chaotic dynamical systems possessing a time-independent Hamiltonian. We evaluate Stabilized NDEs using both short-term and long-term prediction tasks, analogous to weather and climate prediction, respectively. Stabilized NDEs perform at least as well as unstabilized models at the “weather prediction” task, that is, predicting the exact near-term state of the system given initial conditions. On the other hand, Stabilized NDEs significantly outperform unstabilized models at the “climate prediction” task, that is, predicting long-term statistical properties of the system. In particular, Stabilized NDEs conserve energy during long simulations and consequently reproduce the long-term dynamics of the target system with far higher accuracy than non-energy conserving models. Stabilized NDEs also remain numerically stable for significantly longer than unstabilized models. As well as providing a new and lightweight method for combining physical invariants with NDEs, our results highlight the relevance of enforcing conservation laws for the long-term numerical stability and physical accuracy of hybrid models.
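One way such a compensating term can be realized is to relax the state back toward the level set of the invariant along its gradient; the sketch below is our illustration of the general idea, not the authors' formulation:

```python
# Illustrative sketch: wrap an NDE right-hand side f so that violations of the
# invariant H(x) = H0 (e.g. the Hamiltonian) are relaxed back to zero.
import torch

def stabilized_rhs(f, H, H0, gamma=1.0):
    def rhs(x):
        x = x.detach().requires_grad_(True)
        h = H(x)
        (grad_h,) = torch.autograd.grad(h, x)
        # subtract a correction along grad H proportional to the violation
        correction = gamma * (h - H0) * grad_h / (grad_h @ grad_h + 1e-12)
        return (f(x) - correction).detach()  # detach before a plain integrator
    return rhs
```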

How to cite: White, A. and Boers, N.: Stabilized Neural Differential Equations for Hybrid Modeling with Conservation Laws, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9285, https://doi.org/10.5194/egusphere-egu23-9285, 2023.

EGU23-10135 | ECS | Orals | ITS1.13/AS5.2

Exploring physics-informed machine learning for accelerated simulation of permafrost processes 

Brian Groenke, Moritz Langer, Guillermo Gallego, and Julia Boike

Permafrost, i.e. ground material that remains perennially frozen, plays a key role in Arctic ecosystems. Monitoring the response of permafrost to rapid climate change remains difficult due to the sparse availability of long-term, high-quality measurements of the subsurface. Numerical models are therefore an indispensable tool for understanding the evolution of Arctic permafrost. However, large-scale simulation of the hydrothermal processes affecting permafrost is challenging due to the highly nonlinear effects of phase change in porous media. The resulting computational cost of such simulations is especially prohibitive for sensitivity analysis and parameter estimation tasks, where a large number of simulations may be necessary for robust inference of quantities such as temperature, water fluxes, and soil properties. In this work, we explore the applicability of recently developed physics-informed machine learning (PIML) methods for accelerating numerical models of permafrost hydrothermal dynamics. We present a preliminary assessment of two possible applications of PIML in this context: (1) linearization of the nonlinear PDE system according to Koopman operator theory in order to reduce the computational burden of large-scale simulations, and (2) efficient parameterization of the effects of the surface energy balance and snow dynamics on the subsurface hydrothermal regime. By combining the predictive power of machine learning with the underlying conservation laws, PIML can potentially enable researchers and practitioners interested in permafrost to explore complex process interactions at larger spatiotemporal scales.
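As an illustration of application (1), the simplest data-driven Koopman-style surrogate is a linear propagator fitted to snapshots by least squares (dynamic mode decomposition in its most basic form); the sketch below is illustrative only:

```python
# Sketch of a linear (Koopman-style) surrogate fitted to time-ordered snapshots
# by least squares; X holds raw or lifted states as columns.
import numpy as np

def fit_linear_propagator(X):
    """X: (state_dim, n_snapshots); returns A with x_{k+1} ~ A x_k."""
    X0, X1 = X[:, :-1], X[:, 1:]
    return X1 @ np.linalg.pinv(X0)      # minimises ||X1 - A X0||_F

def roll(A, x0, n_steps):
    states = [x0]
    for _ in range(n_steps):
        states.append(A @ states[-1])   # cheap linear time stepping
    return np.stack(states, axis=1)
```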

How to cite: Groenke, B., Langer, M., Gallego, G., and Boike, J.: Exploring physics-informed machine learning for accelerated simulation of permafrost processes, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10135, https://doi.org/10.5194/egusphere-egu23-10135, 2023.

EGU23-10256 | ECS | Posters on site | ITS1.13/AS5.2

Foehn Wind Analysis using Unsupervised Deep Anomaly Detection 

Tobias Milz, Marte Hofsteenge, Marwan Katurji, and Varvara Vetrova

Foehn winds are accelerated, warm and dry winds that can have significant environmental impacts as they descend into the lee of a mountain range. For example, in the McMurdo Dry Valleys in Antarctica, foehn events can cause ice and glacial melt and destabilise ice shelves which, if lost, would contribute to sea-level rise. Consequently, there is a strong interest in a deeper understanding of foehn winds and their meteorological signatures. Most current automatic detection methods rely on rule-based methodologies that require static thresholds of meteorological parameters. However, the patterns of foehn winds are hard to define and differ between alpine valleys around the world. Consequently, data-driven solutions might help create more accurate detection and prediction methodologies.

State-of-the-art machine learning approaches to this problem have shown promising results but follow a supervised learning paradigm. As such, these approaches require accurate labels, which for the most part, are being created by imprecise static rule-based algorithms. Consequently, the resulting machine-learning models are trained to recognise the same static definitions of the foehn wind signatures. 

In this paper, we introduce and compare the first unsupervised machine-learning approaches for detecting foehn wind events. We focus on data from the McMurdo Dry Valleys as an example; however, due to their unsupervised nature, our solutions can recognise a more dynamic definition of foehn wind events and are therefore independent of the location. The first approach is based on multivariate time-series clustering, while the second utilises a deep autoencoder-based anomaly detection method to identify foehn wind events. Our best model achieves an F1-score of 88%, matching or surpassing previous machine-learning methods while providing a more flexible and inclusive definition of foehn events.

How to cite: Milz, T., Hofsteenge, M., Katurji, M., and Vetrova, V.: Foehn Wind Analysis using Unsupervised Deep Anomaly Detection, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10256, https://doi.org/10.5194/egusphere-egu23-10256, 2023.

EGU23-10351 | ECS | Orals | ITS1.13/AS5.2

Deep learning of systematic sea ice model errors from data assimilation increments 

William Gregory, Mitchell Bushuk, Alistair Adcroft, and Yongfei Zhang

Data assimilation is often viewed as a framework for correcting short-term error growth in dynamical climate model forecasts. When viewed on the time scales of climate however, these short-term corrections, or analysis increments, closely mirror the systematic bias patterns of the dynamical model. In this work, we show that Convolutional Neural Networks (CNNs) can be used to learn a mapping from model state variables to analysis increments, thus promoting the feasibility of a data-driven model parameterization which predicts state-dependent model errors. We showcase this problem using an ice-ocean data assimilation system within the fully coupled Seamless system for Prediction and EArth system Research (SPEAR) model at the Geophysical Fluid Dynamics Laboratory (GFDL), which assimilates satellite observations of sea ice concentration. The CNN then takes inputs of data assimilation forecast states and tendencies, and makes predictions of the corresponding sea ice concentration increments. Specifically, the inputs are sea ice concentration, sea-surface temperature, ice velocities, ice thickness, net shortwave radiation, ice-surface skin temperature, and sea-surface salinity. We show that the CNN is able to make skilful predictions of the increments, particularly between December and February in both the Arctic and Antarctic, with average daily spatial pattern correlations of 0.72 and 0.79, respectively. Initial investigation of implementation of the CNN into the fully coupled SPEAR model shows that the CNN can reduce biases in retrospective seasonal sea ice forecasts by emulating a data assimilation system, further suggesting that systematic sea ice biases could be reduced in a free-running climate simulation.
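Schematically, the mapping described above can be read as a convolutional network from multi-channel state/tendency fields to an increment field; the architecture and grid sizes below are illustrative assumptions, not the study's configuration:

```python
# Illustrative sketch: a CNN mapping eight state/tendency channels to a
# sea ice concentration (SIC) increment field.
import torch
import torch.nn as nn

increment_cnn = nn.Sequential(
    nn.Conv2d(8, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, kernel_size=3, padding=1),   # predicted SIC increment
)

x = torch.randn(4, 8, 80, 90)    # batch of polar grids: SIC, SST, ice velocities, ...
dsic = increment_cnn(x)          # (4, 1, 80, 90) analysis-increment prediction
```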

How to cite: Gregory, W., Bushuk, M., Adcroft, A., and Zhang, Y.: Deep learning of systematic sea ice model errors from data assimilation increments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10351, https://doi.org/10.5194/egusphere-egu23-10351, 2023.

Current numerical weather prediction models contain significant systematic errors, due in part to indeterminate ground forcing (GF). This study considers an optimal virtual GF (GFo) derived by training observed and simulated datasets of 10-m wind speeds (WS10) for summer and winter. The GFo is added to an offline surface multilayer model (SMM) to revise predictions of WS10 in China by the Weather Research and Forecasting model (WRF). This revision is a data-based optimization under physical constraints. It reduces WS10 errors and offers wide applicability. The resulting model outperforms two purely physical forecasts (the original WRF forecast and the SMM with physical GF parameterized using urban, vegetation, and subgrid topography) and two purely data-based revisions (i.e., multilinear regression and multilayer perceptron). Compared with original WRF forecasting, using the GFo scheme reduces the Root Mean Square Error (RMSE) in WS10 across China by 25% in summer and 32% in winter. The frontal area index of GFo indicates that it includes both the effects of indeterminate GF and other possible complex physical processes associated with WS10.

How to cite: Feng, J.: Mitigate forecast error in surface wind speed using an offline single-column model with optimal ground forcing, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10394, https://doi.org/10.5194/egusphere-egu23-10394, 2023.

EGU23-10726 | Posters virtual | ITS1.13/AS5.2

A hybrid VMD-WT-InceptionTime model for multi-horizon short-term air temperature forecasting in Alaska 

Jaakko Putkonen, M. Aymane Ahajjam, Timothy Pasch, and Robert Chance

The lack of ground-level observation stations outside of settlements makes monitoring and forecasting local weather and permafrost challenging in the Arctic. Such predictive information is essential to help prepare for potentially hazardous weather conditions, especially during winter. In this study, we aim at enhancing predictive analytics of permafrost and temperature in Alaska using a hybrid forecasting technique. In particular, we propose the VMD-WT-InceptionTime model for short-term air temperature forecasting.

The proposed technique incorporates data preprocessing techniques and deep learning to enhance the accuracy of air temperature forecasts for the next seven days. Initially, the Spearman correlation coefficient is utilized to examine the relationship between different inputs and the forecast target temperature. Following this, Variational Mode Decomposition (VMD) is used to decompose the most output-correlated input variables (i.e., temperature and relative humidity) to extract intrinsic and non-stationary time-frequency features from the original sequences. The Wavelet Transform (WT) is then employed to further extract intrinsic multi-resolution patterns from these decomposed input variables. Finally, a deep InceptionTime model is used for multi-step air temperature forecasting using these processed sequences. The forecasting technique was developed using an open dataset holding more than 20 years of data from three locations in Alaska: the North Slope, the Arctic National Wildlife Refuge, and the Diomede Island region in the Bering Strait. Model performance has been rigorously evaluated using metrics including RMSE, MAPE and error.
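The preprocessing chain can be sketched as follows, assuming the third-party vmdpy package (with signature VMD(f, alpha, tau, K, DC, init, tol)) and PyWavelets; all parameter values are illustrative, not those used in the study:

```python
# Preprocessing sketch assuming the third-party vmdpy package and PyWavelets;
# parameter values are illustrative placeholders.
import numpy as np
import pywt
from vmdpy import VMD

temperature = np.random.rand(1024)            # placeholder hourly series

# 1) variational mode decomposition into K intrinsic modes
modes, _, _ = VMD(temperature, alpha=2000, tau=0.0, K=5, DC=0, init=1, tol=1e-7)

# 2) wavelet transform of each mode for multi-resolution features
features = [pywt.wavedec(mode, "db4", level=3) for mode in modes]
# 'features' (per mode: [cA3, cD3, cD2, cD1]) then feeds the InceptionTime model
```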

Results highlight the effectiveness of the proposed hybrid model in providing more accurate short-term forecasts than several baselines (GBDT, SVR, ExtraTrees, RF, ARIMA, LSTM, GRU, and Transformer). More specifically, the technique achieved average RMSE and MAPE improvement rates of 11.21% and 16.13% for the North Slope, 30.01% and 34.97% for the Arctic National Wildlife Refuge, and 16.39% and 23.46% for the Diomede Islands region. In addition, the proposed technique produces forecasts over all seven horizons with a maximum error below 1.5 K, a minimum error above -1.2 K, and an average error lower than 0.18 K for the North Slope; a maximum error below 1 K, a minimum error above -0.9 K, and an average below 0.1 K for the Arctic National Wildlife Refuge; and a maximum error below 0.9 K, a minimum error above -0.8 K, and an average below 0.13 K for the Diomede Islands region. The worst performances were errors of around 6 K at the third horizon (i.e., the 3rd day) for the North Slope and the Arctic National Wildlife Refuge and at the last horizon (i.e., the 7th day) for the Diomede Islands region. Most of the worst performances in all three locations can be attributed to having to produce forecasts of higher variation and wider temperature ranges than the averages.

Overall, this research highlights the potential of decomposition techniques and deep learning to: 1) reveal and effectively learn the underlying cyclicity of air temperatures at varying resolutions, allowing accurate predictions without any knowledge of the governing physics, and 2) produce accurate multi-step temperature forecasts in Arctic climates.

How to cite: Putkonen, J., Ahajjam, M. A., Pasch, T., and Chance, R.: A hybrid VMD-WT-InceptionTime model for multi-horizon short-term air temperature forecasting in Alaska, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10726, https://doi.org/10.5194/egusphere-egu23-10726, 2023.

EGU23-10810 | ECS | Orals | ITS1.13/AS5.2

Oceanfourcast: Emulating Ocean Models with Transformers for Adjoint-based Data Assimilation 

Suyash Bire, Björn Lütjens, Dava Newman, and Chris Hill

Adjoints have become a staple of the oceanic and atmospheric numerical modeling community over the past couple of decades, as they are useful for tuning dynamical models, sensitivity analyses, and data assimilation. One such application is the generation of reanalysis datasets, which provide an optimal record of our past weather, climate, and ocean. For example, the state-of-the-art ocean-ice reanalysis dataset ECCO is created by optimally combining a numerical ocean model with heterogeneous observations through a technique called data assimilation. Data assimilation in ECCO minimizes the distance between model and observations by calculating adjoints, i.e., gradients of the loss with respect to the simulation forcing fields (wind and surface heat fluxes). The forcing fields are iteratively updated and the model is rerun until the loss is minimized, ensuring that the numerical model does not drastically deviate from the observations. Calculating adjoints, however, requires either disproportionately high computational resources or rewriting the dynamical model code to be autodifferentiable.

Therefore, we ask if deep learning-based emulators can provide fast and accurate adjoints. Ocean data are smooth, high-dimensional, and exhibit complex spatiotemporal correlations. As an initial foray into ocean emulators, we therefore leverage a combination of neural operators and transformers. Specifically, we have adapted the FourCastNet architecture, which has successfully emulated ERA5 weather data in seconds rather than hours, to emulate an idealized ocean simulation.

We generated a ground-truth dataset by simulating a double gyre, an idealized representation of the North Atlantic Ocean, using MITgcm, a state-of-the-art dynamical model. The model was forced by zonal wind at the surface and by relaxation to a meridional temperature profile (warm at low latitudes, cold at high latitudes). This simulation produced turbulent western boundary currents embedded in the large-scale gyre circulation. We performed four additional simulations with modified magnitudes of the sea surface temperature (SST) relaxation and wind forcing to introduce diversity into the dataset. From these simulations, we used four state variables (meridional and zonal surface velocities, pressure, and temperature) as well as the forcing fields (zonal wind velocity and relaxation SST profile), sampled in 10-day steps. The data were split into training, validation, and test sets such that the validation and test sets were unseen during training. These datasets provide an ideal testbed for evaluating and comparing the performance of data-driven ocean emulators.

We used these data to train and evaluate Oceanfourcast. Our initial results show that Oceanfourcast can successfully predict the streamfunction and pressure for a lead time of 1 month.

We are currently working on generating adjoints from Oceanfourcast. We expect the adjoint calculation to require significantly less compute time than that from a full-scale dynamical model like MITgcm. Our work shows a promising path towards deep-learning-augmented data assimilation and uncertainty quantification.
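
As a hedged illustration of why a differentiable emulator makes adjoints cheap (a toy stand-in network, not the actual Oceanfourcast code), the gradient of a model-observation misfit with respect to the forcing fields can be obtained in a single backward pass:

    # Sketch: adjoint-like sensitivities from a differentiable emulator.
    import torch

    class Emulator(torch.nn.Module):           # placeholder for Oceanfourcast
        def __init__(self):
            super().__init__()
            self.net = torch.nn.Conv2d(6, 4, kernel_size=3, padding=1)

        def forward(self, state_and_forcing):  # 4 state vars + 2 forcing fields in
            return self.net(state_and_forcing)

    model = Emulator()
    state = torch.randn(1, 4, 64, 64)          # u, v, pressure, temperature
    forcing = torch.randn(1, 2, 64, 64, requires_grad=True)  # wind, SST relaxation
    obs = torch.randn(1, 4, 64, 64)

    pred = model(torch.cat([state, forcing], dim=1))
    loss = ((pred - obs) ** 2).mean()          # model-observation misfit

    # One backward pass yields dLoss/dForcing: the emulator "adjoint".
    (adjoint,) = torch.autograd.grad(loss, forcing)
    print(adjoint.shape)                       # torch.Size([1, 2, 64, 64])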

How to cite: Bire, S., Lütjens, B., Newman, D., and Hill, C.: Oceanfourcast: Emulating Ocean Models with Transformers for Adjoint-based Data Assimilation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10810, https://doi.org/10.5194/egusphere-egu23-10810, 2023.

EGU23-10904 | ECS | Posters on site | ITS1.13/AS5.2

On the choice of turbulence eddy fluxes to learn from in data-driven methods 

Feier Yan, Julian Mak, and Yan Wang

Recent works have demonstrated the viability of employing data-driven / machine learning methods for learning more about ocean turbulence, with applications to turbulence parameterisations in ocean general circulation models. Focusing on mesoscale geostrophic turbulence in the ocean context, works thus far have mostly focused on the choice of algorithms and the testing of trained models. Here we focus instead on the choice of eddy flux data to learn from. We argue that, for mesoscale geostrophic turbulence, it may be beneficial from a theoretical as well as practical point of view to learn from eddy fluxes with the dynamically inert rotational fluxes removed (ideally in a gauge-invariant fashion), instead of the divergence of the eddy fluxes as has been considered thus far. Outlooks for physically constrained and interpretable machine learning will be given in light of the results.
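
As a hedged illustration of the decomposition referred to above (our notation, not necessarily the authors'), a two-dimensional eddy flux can be split via a Helmholtz decomposition into divergent and rotational parts:

    \mathbf{F} = \nabla\phi \;+\; \hat{\mathbf{z}}\times\nabla\psi,
    \qquad
    \nabla\cdot\mathbf{F} = \nabla^{2}\phi .

The rotational part \hat{\mathbf{z}}\times\nabla\psi is divergence-free and therefore dynamically inert in the mean budget, so learning the divergent component \nabla\phi alone removes this contribution; a gauge ambiguity arises because \phi and \psi are only defined up to contributions that leave \mathbf{F} unchanged.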

How to cite: Yan, F., Mak, J., and Wang, Y.: On the choice of turbulence eddy fluxes to learn from in data-driven methods, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10904, https://doi.org/10.5194/egusphere-egu23-10904, 2023.

EGU23-10959 | Orals | ITS1.13/AS5.2

Deep learning parameterization of small-scale vertical velocity variability for atmospheric models 

Donifan Barahona, Katherine Breen, and Heike Kalesse-Los

Small-scale fluctuations in vertical wind velocity, unresolved by climate and weather forecast models, play a particularly important role in determining vapor and tracer fluxes, turbulence, and cloud formation. Fluctuations in vertical wind velocity are challenging to represent since they depend on orography, large-scale circulation features, convection, and wind shear. Parameterizations developed using data retrieved at specific locations typically lack generalization and may introduce errors when applied across a wide range of conditions. Retrievals of vertical wind velocity are also difficult and subject to large uncertainty. This work develops a new data-driven, neural network representation of subgrid-scale variability in vertical wind velocity. Using a novel deep learning technique, the new parameterization merges data from high-resolution global cloud-resolving model simulations with high-frequency radar and lidar retrievals. Our method aims to reproduce observed statistics rather than fit individual measurements; hence it is resilient to experimental uncertainty and generalizes robustly. The neural network parameterization can be driven by weather forecast and reanalysis products to make real-time estimations. It is shown that the new parameterization generalizes well outside of the training data and reproduces the statistics of vertical wind velocity much better than purely data-driven models.

How to cite: Barahona, D., Breen, K., and Kalesse-Los, H.: Deep learning parameterization of small-scale vertical velocity variability for atmospheric models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10959, https://doi.org/10.5194/egusphere-egu23-10959, 2023.

EGU23-11293 | ECS | Posters on site | ITS1.13/AS5.2

National scale agricultural development dynamics under socio-political drivers in Saudi Arabia since 1990 

Ting Li, Oliver López Valencia, Kasper Johansen, and Matthew McCabe

Driven in large part by policy initiatives designed to increase food security, and realized via the construction of thousands of center-pivot irrigation fields since the 1970s, agricultural development in Saudi Arabia has undergone tremendous changes. However, little is known about the accurate number, acreage, and changing dynamics of the fields. To bridge the knowledge gap between the political drivers and the in-field response, we leveraged a hybrid machine learning framework implementing Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Convolutional Neural Networks, and Spectral Clustering in a stepwise manner to delineate center-pivot fields on a national scale in Saudi Arabia using historical Landsat imagery since 1990. The framework achieved producer's and user's accuracies larger than 83.7% and 90.2%, respectively, when assessed against 28,000 manually delineated fields collected from different regions and periods. We explored the multi-decadal dynamics of agricultural development in Saudi Arabia by quantifying the number, acreage, and size distribution of center-pivot fields, along with the first and last detection year of the fields since 1990. Agricultural development in Saudi Arabia has passed through four stages: an initialization stage before 1990, a contraction stage from 1990 to 2010, an expansion stage from 2010 to 2016, and an ongoing contraction stage since 2016. Most of the fields predated 1990, representing over 8,800 km2 in that year, as a result of the policy initiatives to stimulate wheat production, which made Saudi Arabia the sixth-largest exporter of wheat in the 1980s. A decreasing trend was observed from 1990 to 2010, with an average of 8,011 km2 of fields detected during those two decades, in response to the policy initiative implemented to phase out wheat after 1990. As a consequence of planting fodder crops to promote the dairy industry, the number and extent of fields increased rapidly from 2010 to 2015 and reached a peak in 2016, with 33,961 fields representing 9,400 km2. Agricultural extent has seen a continuous decline since 2016, falling below 1990 values in 2020. This decline has been related to sustainable policy initiatives implemented for the Saudi Vision 2030. There is some evidence of an uptick in 2021, also observed in an ongoing analysis for 2022, which might be a response to global influences such as the COVID-19 pandemic and the more recent conflict in Ukraine, which has disrupted the international supply of agricultural products. The results provide a historical account of agricultural activity throughout the Kingdom and a basis for informed decision-making on sustainable irrigation and agricultural practices, helping to better protect and manage the nation's threatened groundwater resources, and providing insights into the resilience and elasticity of the Saudi Arabian food system to global perturbations.
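
As a hedged sketch of the first, density-based step of such a framework (not the authors' implementation; the coordinates below are synthetic), DBSCAN from scikit-learn groups candidate field pixels into spatial clusters:

    # Sketch: density-based clustering of candidate center-pivot pixels.
    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(0)
    # two synthetic fields (~400 m scale) plus scattered noise, in metres
    field_a = rng.normal(loc=(0.0, 0.0), scale=400.0, size=(200, 2))
    field_b = rng.normal(loc=(3000.0, 1000.0), scale=400.0, size=(200, 2))
    noise = rng.uniform(-5000, 5000, size=(40, 2))
    pixels = np.vstack([field_a, field_b, noise])

    labels = DBSCAN(eps=300, min_samples=10).fit_predict(pixels)
    print("clusters:", labels.max() + 1, "| noise pixels:", (labels == -1).sum())

In the described framework, such clusters would then be passed to the CNN and spectral clustering stages for field delineation.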

How to cite: Li, T., López Valencia, O., Johansen, K., and McCabe, M.: National scale agricultural development dynamics under socio-political drivers in Saudi Arabia since 1990, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11293, https://doi.org/10.5194/egusphere-egu23-11293, 2023.

EGU23-11687 | ECS | Orals | ITS1.13/AS5.2

Objectively Determining the Number of Similar Hydrographic Clusters with Unsupervised Machine Learning 

Carola Trahms, Yannick Wölker, and Arne Biastoch

Determining the number of existing water masses and defining their boundaries is subject to ongoing discussion in physical oceanography. Traditionally, water masses are defined manually by experts setting constraints based on experience and previous knowledge about the hydrographic properties describing them. In recent years, clustering, an unsupervised machine learning approach, has been introduced as a tool to determine clusters, i.e., volumes with similar hydrographic properties, without explicitly defining their hydrographic constraints. However, the exact number of clusters to look for has so far been set manually by an expert.

We propose a method that determines a fitting number of hydrographic clusters in a data-driven way. In a first step, the method averages the data in slices of different sizes along the time or depth axis, as the structure of the hydrographic space changes strongly in either time or depth. The method then applies clustering algorithms to the averaged data and calculates off-the-shelf evaluation scores (Davies-Bouldin, Calinski-Harabasz, Silhouette Coefficient) for several predefined numbers of clusters. In the last step, the optimal number of clusters is determined by analyzing the cluster evaluation scores across different numbers of clusters for optima or relevant changes in trend.
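
A minimal sketch of this score-based selection, using the three named off-the-shelf scores as implemented in scikit-learn on synthetic stand-in data (KMeans is used purely for illustration; the study's clustering algorithm is not specified here):

    # Sketch: scan candidate cluster numbers and record evaluation scores.
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import (calinski_harabasz_score,
                                 davies_bouldin_score, silhouette_score)

    X, _ = make_blobs(n_samples=500, centers=4, random_state=0)  # stand-in T/S data

    for k in range(2, 9):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        print(k,
              round(davies_bouldin_score(X, labels), 2),     # lower is better
              round(calinski_harabasz_score(X, labels), 1),  # higher is better
              round(silhouette_score(X, labels), 2))         # higher is better

The optimal number of clusters is then read off from the behaviour of these scores across k, as described above.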

For validation, we applied this method to the output of the high-resolution Atlantic Ocean model VIKING20X for the subpolar North Atlantic between 1993 and 1997, in direct exchange with domain experts to discuss the resulting clusters. Due to the change from strong to weak deep convection in these years, the hydrographic properties vary strongly in the time and depth dimensions, providing a specific challenge for our methodology.

Our findings suggest that it is possible to identify an optimal number of clusters using off-the-shelf cluster evaluation scores that capture the underlying structure of the hydrographic space. The optimal number of clusters identified by our data-driven method agrees with the optimal number found in expert interviews. These findings contribute to aiding and objectifying water mass definitions across multiple expert decisions, and demonstrate the benefit of introducing data science methods to analyses in physical oceanography.

How to cite: Trahms, C., Wölker, Y., and Biastoch, A.: Objectively Determining the Number of Similar Hydrographic Clusters with Unsupervised Machine Learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11687, https://doi.org/10.5194/egusphere-egu23-11687, 2023.

EGU23-11906 | ECS | Orals | ITS1.13/AS5.2

Untapping the potential of geostationary EO data to understand drought impacts with XAI 

Basil Kraft, Gregory Duveiller, Markus Reichstein, and Martin Jung

Ecosystems worldwide are affected by extreme climate conditions such as droughts, but we still lack understanding of the dynamics involved. Which factors render an ecosystem more resilient, and on which temporal scales do weather patterns affect vegetation state and physiology? Traditional approaches to such questions involve assumption-based land surface modeling or inversions. Machine learning (ML) methods can provide a complementary perspective on how ecosystems respond to climate in a more data-driven and assumption-free manner. However, ML depends heavily on data, and commonly used observations of vegetation contain at best one observation per day, with most products provided at 16-daily to monthly temporal resolution. This masks important processes at sub-monthly time scales. In addition, ML models are inherently difficult to interpret, which still limits their applicability for process understanding.

In the present study, we combine modern deep learning models in the time domain with observations from the geostationary Meteosat Second Generation (MSG) satellite, centered over Africa. We model fractional vegetation cover (representing vegetation state) and land surface temperature (as a proxy for water stress) from MSG as a function of meteorology and static geofactors. MSG collects observations at sub-daily frequency, rendering it an excellent tool to study short- to mid-term land surface processes. Furthermore, we use methods from explainable ML for post-hoc model interpretation to identify meteorological drivers of vegetation dynamics and their interaction with key geofactors.
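
As one hedged example of the kind of post-hoc interpretation meant here (permutation importance is only one option among explainable-ML tools, and the data and driver names below are synthetic placeholders):

    # Sketch: permutation importance of meteorological drivers for a fitted model.
    import numpy as np
    from sklearn.inspection import permutation_importance
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    drivers = ["precip", "temperature", "radiation", "vpd"]   # illustrative names
    X = rng.normal(size=(3000, len(drivers)))
    fvc = 0.8 * X[:, 0] + 0.3 * X[:, 3] + rng.normal(0, 0.1, 3000)  # toy target

    model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500,
                         random_state=0).fit(X, fvc)
    result = permutation_importance(model, X, fvc, n_repeats=10, random_state=0)
    for name, val in zip(drivers, result.importances_mean):
        print(f"{name:12s} {val:.3f}")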

From the analysis, we expect to gain novel insights into ecosystem responses to droughts with high temporal fidelity. The drought response of vegetation can be highly diverse and complex, especially in the arid to semi-arid regions prevalent in Africa. We also assess the potential of explainable machine learning to discover new linkages and knowledge, and discuss potential pitfalls of the approach. Explainable machine learning, combined with potent deep learning approaches and modern Earth observation products, offers the opportunity to complement assumption-based modeling in predicting and understanding ecosystem responses to extreme climate.

How to cite: Kraft, B., Duveiller, G., Reichstein, M., and Jung, M.: Untapping the potential of geostationary EO data to understand drought impacts with XAI, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11906, https://doi.org/10.5194/egusphere-egu23-11906, 2023.

EGU23-11958 | ECS | Posters on site | ITS1.13/AS5.2

Modelling Soil Temperature and Soil Moisture in Space, Depth, and Time with Machine Learning Techniques 

Maiken Baumberger, Linda Adorf, Bettina Haas, Nele Meyer, and Hanna Meyer

Soil temperature and soil moisture variations have large effects on ecological processes in the soil. To investigate and understand these processes, high-resolution data of soil temperature and soil moisture are required. Here, we present an approach to generate soil temperature and soil moisture data continuously in space, depth, and time for a 400 km2 study area in the Fichtel Mountains (Germany). As reference data, measurements with 1-m-long soil probes were taken. To cover many different locations, the 15 available soil probes were relocated regularly over the course of one year. With this approach, around 250 different locations in forests, meadows, and agricultural fields were captured under a variety of meteorological conditions. These measurements are combined with readily available meteorological data, satellite data, and soil maps in a machine learning approach to learn the complex relations between these variables. We aim for a model that can predict soil temperature and soil moisture continuously for our study area in the Fichtel Mountains, at a spatial resolution of 10 m x 10 m, down to 1 m depth in segments of 10 cm, and at hourly resolution in time. Here, we present the results of a pilot study focused on temperature and moisture changes down to 1 m depth at a single location. To take temporal lags into account, we construct a Long Short-Term Memory network with meteorological data as predictors to make temperature and moisture predictions in time and depth. The results indicate a high ability of the model to reproduce the time series at the single location and highlight the potential of the approach for space-time-depth mapping of soil temperature and soil moisture.
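
A minimal sketch of such a depth-resolved LSTM in PyTorch, with illustrative input/output dimensions rather than the study's actual configuration:

    # Sketch: meteorological drivers in, two targets at ten 10-cm depths out.
    import torch

    class SoilLSTM(torch.nn.Module):
        def __init__(self, n_met=8, n_depths=10, n_targets=2, hidden=64):
            super().__init__()
            self.lstm = torch.nn.LSTM(n_met, hidden, batch_first=True)
            self.head = torch.nn.Linear(hidden, n_depths * n_targets)
            self.n_depths, self.n_targets = n_depths, n_targets

        def forward(self, met):                 # met: (batch, time, n_met)
            h, _ = self.lstm(met)
            out = self.head(h)                  # one prediction per time step
            return out.view(*out.shape[:2], self.n_depths, self.n_targets)

    model = SoilLSTM()
    met = torch.randn(4, 24, 8)                 # 4 samples, 24 hourly steps
    pred = model(met)                           # (4, 24, 10, 2): T and moisture per depth
    print(pred.shape)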

How to cite: Baumberger, M., Adorf, L., Haas, B., Meyer, N., and Meyer, H.: Modelling Soil Temperature and Soil Moisture in Space, Depth, and Time with Machine Learning Techniques, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11958, https://doi.org/10.5194/egusphere-egu23-11958, 2023.

EGU23-12218 | Posters on site | ITS1.13/AS5.2

Bias correction of aircraft temperature observations in the Korean Integrated Model based on a deep learning approach 

Hui-nae Kwon, Hyeon-ju Jeon, Jeon-ho Kang, In-hyuk Kwon, and Seon Ki Park

Aircraft-based observations are among the important anchor data used in numerical weather prediction (NWP) models. Nevertheless, a bias in the temperature observations has been noted in several previous studies. As the performance of the hybrid four-dimensional ensemble variational (hybrid-4DEnVar) data assimilation (DA) system of the Korean Integrated Model (KIM), the operational model of the Korea Meteorological Administration (KMA), has advanced, the need for aircraft temperature bias correction (BC) has been confirmed. Accordingly, as a preliminary study, a static BC method based on linear regression was applied to the KIM Package for Observation Processing (KPOP) system. However, the results showed two limitations: a spatial discontinuity and a dependency on the calculation period of the BC coefficients.

In this study, we developed a machine learning-based bias estimation model to overcome these limitations. A MultiLayer Perceptron (MLP) was trained to consider the vertical, spatial, and temporal characteristics of each observation by flight ID and phase, and at the same time to consider the correlations among observation variables. After removing the bias predicted by the estimation model, the mean of the background innovation (O-B) decreases from 0.2217 K to 0.0136 K over a given test period. To verify the impact of the BC on the analysis field, the bias estimation model will next be grafted onto the KPOP system and several DA cycle experiments will be conducted in the KIM.
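
A hedged sketch of such a bias-estimation model, using scikit-learn's MLPRegressor on synthetic data (the operational model's predictors and architecture are not reproduced here):

    # Sketch: learn the O-B innovation from flight predictors, then subtract it.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    n = 5000
    X = np.column_stack([
        rng.uniform(0, 12000, n),     # altitude [m]
        rng.integers(0, 3, n),        # phase: ascent / cruise / descent
        rng.uniform(-60, 60, n),      # latitude
    ])
    true_bias = 0.3 + 1e-5 * X[:, 0]  # synthetic altitude-dependent warm bias
    o_minus_b = true_bias + rng.normal(0, 0.5, n)

    mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0)
    mlp.fit(X[:4000], o_minus_b[:4000])
    corrected = o_minus_b[4000:] - mlp.predict(X[4000:])
    print("mean O-B before:", o_minus_b[4000:].mean().round(3),
          "after:", corrected.mean().round(3))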

How to cite: Kwon, H., Jeon, H., Kang, J., Kwon, I., and Park, S. K.: Bias correction of aircraft temperature observations in the Korean Integrated Model based on a deep learning approach, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12218, https://doi.org/10.5194/egusphere-egu23-12218, 2023.

EGU23-12355 | ECS | Orals | ITS1.13/AS5.2

Comparison of NWP Models Used in Training Surrogate Wave Models 

Ajit Pillai, Ian Ashton, Jiaxin Chen, and Edward Steele

Machine learning is increasingly being applied to ocean wave modelling. Surrogate modelling has the potential to reduce or bypass the large computational requirements, creating a low-computational-cost model that offers a high level of accuracy. One approach integrates in-situ measurements and historical model runs to achieve the spatial coverage of the model and the accuracy of the in-situ measurements. Once operational, such a system requires very little computational power, meaning that it could be deployed to a mobile phone, operational vessel, or autonomous vessel to give continuous data. As such, it represents a significant change to the availability of met-ocean data, with the potential to revolutionise data provision and use in marine and coastal settings.

This presentation explores the impact that the underlying physics-based model can have in such a machine learning-driven framework, comparing training on a bespoke regional SWAN wave model developed for wave energy developments in the South West of the UK against training on the larger North-West European Shelf long-term hindcast wave model run by the UK Met Office. The presentation discusses the differences in the underlying NWP models and the impacts these have on the surrogate wave models' accuracy in both nowcasting and forecasting wave conditions at areas of interest for renewable energy developments. The results identify the importance of having a high-quality, validated NWP model for training such a system, and the way in which machine learning methods can propagate and exaggerate the underlying model uncertainties.

How to cite: Pillai, A., Ashton, I., Chen, J., and Steele, E.: Comparison of NWP Models Used in Training Surrogate Wave Models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12355, https://doi.org/10.5194/egusphere-egu23-12355, 2023.

EGU23-12403 | ECS | Orals | ITS1.13/AS5.2

PseudoSpectralNet: A hybrid neural differential equation for atmosphere models 

Maximilian Gelbrecht and Niklas Boers

When predicting complex systems such as parts of the Earth system, one typically relies on differential equations, which can be incomplete, miss unknown influences, or include errors through their discretization. To remedy these effects, we present PseudoSpectralNet (PSN): a hybrid model that incorporates both a knowledge-based part, an atmosphere model, and a data-driven part, an artificial neural network (ANN). PSN is a neural differential equation (NDE): it defines the right-hand side of a differential equation combining a physical model with ANNs, and its parameters are trained inside this NDE. Similar to the approach of many atmosphere models, part of the model is computed in the spherical harmonics domain and other parts in the grid domain. The model consists of ANN layers in each domain, information about derivatives, and parameters such as the orography. We demonstrate the capabilities of PSN on the well-studied Marshall-Molteni quasigeostrophic model.
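
A minimal sketch of the hybrid neural-differential-equation idea, assuming a toy physical core and explicit Euler integration (the actual PSN uses spectral transforms and a quasigeostrophic core; everything below is illustrative):

    # Sketch: right-hand side = known physics + neural correction, trained
    # end-to-end through the integrator.
    import torch

    nn_part = torch.nn.Sequential(torch.nn.Linear(3, 32), torch.nn.Tanh(),
                                  torch.nn.Linear(32, 3))

    def f_phys(x):                    # placeholder "known physics"
        return -0.1 * x

    def rhs(x):
        return f_phys(x) + nn_part(x)

    def integrate(x0, dt=0.01, steps=100):
        x = x0
        for _ in range(steps):        # explicit Euler; fully differentiable
            x = x + dt * rhs(x)
        return x

    x0 = torch.randn(16, 3)
    target = torch.randn(16, 3)
    loss = ((integrate(x0) - target) ** 2).mean()
    loss.backward()                   # gradients reach nn_part's parameters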

How to cite: Gelbrecht, M. and Boers, N.: PseudoSpectralNet: A hybrid neural differential equation for atmosphere models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12403, https://doi.org/10.5194/egusphere-egu23-12403, 2023.

EGU23-12458 | ECS | Posters on site | ITS1.13/AS5.2

Training Deep Data Assimilation Networks on Sparse and Noisy Observations 

Vadim Zinchenko and David Greenberg

Data Assimilation (DA) is a challenging and computationally expensive problem targeting hidden variables in high-dimensional spaces. 4DVar methods are widely used in weather forecasting to fit simulations to sparse observations by optimizing over numerical model inputs. The complexity of this inverse problem and the sequential nature of common 4DVar approaches lead to long computation times with limited opportunity for parallelization. Here we propose using machine learning (ML) algorithms to replace the entire 4DVar optimization problem with a single forward pass through a neural network that maps from noisy and incomplete observations at multiple time points to a complete system state estimate at a single time point. We train the neural network using a loss function derived from the weak-constraint 4DVar objective, including terms incorporating errors in both model and data. In contrast to standard 4DVar approaches, our method amortizes the computational investment of training to avoid solving an optimization problem for each assimilation window, and its non-sequential nature allows for easy parallelization along the time axis for both training and inference. In contrast to most previous ML-based data assimilation methods, our approach does not require access to complete, noise-free simulations for supervised learning or gradient-free approximations such as ensemble Kalman filtering. To demonstrate the potential of our approach, we show a proof of concept on the chaotic Lorenz'96 system, using a novel "1.5D Unet" architecture combining 1D and 2D convolutions.
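
A hedged sketch of a weak-constraint 4DVar-style training loss of this kind, with placeholder dynamics M, observation operator H, and scalar weights standing in for the model- and observation-error covariances:

    # Sketch: observation misfit + model-error penalty over a window.
    import torch

    def weak_4dvar_loss(states, obs, obs_mask, M, H, r_inv=1.0, q_inv=1.0):
        """states: (T, D) network output; obs: (T, D_obs) with validity mask."""
        obs_term = (obs_mask * (H(states) - obs) ** 2).sum() * r_inv
        model_term = ((states[1:] - M(states[:-1])) ** 2).sum() * q_inv
        return obs_term + model_term

    # Toy example: observe 3 of 5 variables, linear damping dynamics.
    T, D = 10, 5
    H = lambda x: x[..., :3]
    M = lambda x: 0.95 * x
    states = torch.randn(T, D, requires_grad=True)
    obs = torch.randn(T, 3)
    mask = (torch.rand(T, 3) > 0.5).float()   # sparse, noisy observations
    loss = weak_4dvar_loss(states, obs, mask, M, H)
    loss.backward()

In the described approach, `states` would be the output of the network mapping observations to a state estimate, and the same loss would drive its training.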

How to cite: Zinchenko, V. and Greenberg, D.: Training Deep Data Assimilation Networks on Sparse and Noisy Observations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12458, https://doi.org/10.5194/egusphere-egu23-12458, 2023.

EGU23-12566 | Posters on site | ITS1.13/AS5.2

Comparison of PM2.5 concentrations prediction model performance using Artificial Intelligence 

Kyung-Hui Wang, Chae-Yeon Lee, Ju-Yong Lee, Min-Woo Jung, Dong-Geon Kim, Seung-Hee Han, Dae-Ryun Choi, and Hui-young Yun

Since PM2.5 (particulate matter with an aerodynamic diameter of less than 2.5 µm) directly threatens public health, and in order to allow appropriate preventive measures to be taken in advance, the Korea Ministry of Environment (MOE) has implemented a nationwide PM10 forecast since February 2014 and a nationwide PM2.5 forecast since January 2015. The current PM forecast by the MOE subdivides the country into 19 regions and forecasts the level of PM in 4 stages: "Good", "Moderate", "Unhealthy", and "Very unhealthy".

The PM air quality forecasting system currently operated by the MOE is based on a numerical forecast model together with weather and emission models. Numerical forecasting models have fundamental limitations, such as uncertainty in input data (e.g., emissions and meteorological data) and in the numerical model itself. Recently, many studies have applied artificial intelligence methods such as DNNs, RNNs, LSTMs, and CNNs to PM prediction to overcome the limitations of numerical models.

In this study, to improve on the prediction performance of the numerical model, past observational data (air quality and meteorological data) and numerical forecasting model data (from a chemical transport model) are used as input. The machine learning model consists of a DNN and a Seq2Seq model, and predicts 3 days (D+0, D+1, D+2) using 6-hour and 1-hour average input data, respectively. The PM2.5 concentrations predicted by the machine learning model and the numerical model were compared with PM2.5 measurements.
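
A minimal sketch of a Seq2Seq forecaster of this kind in PyTorch, with illustrative dimensions (the operational inputs and architecture are not reproduced here):

    # Sketch: encode a window of past predictors, decode a 3-day hourly series.
    import torch

    class Seq2Seq(torch.nn.Module):
        def __init__(self, n_in=12, hidden=64, horizon=72):
            super().__init__()
            self.encoder = torch.nn.GRU(n_in, hidden, batch_first=True)
            self.decoder = torch.nn.GRU(1, hidden, batch_first=True)
            self.proj = torch.nn.Linear(hidden, 1)
            self.horizon = horizon

        def forward(self, past):                    # past: (batch, t_in, n_in)
            _, h = self.encoder(past)
            y = past.new_zeros(past.size(0), 1, 1)  # start token
            outs = []
            for _ in range(self.horizon):           # autoregressive decoding
                o, h = self.decoder(y, h)
                y = self.proj(o)
                outs.append(y)
            return torch.cat(outs, dim=1)           # (batch, horizon, 1)

    model = Seq2Seq()
    print(model(torch.randn(8, 28, 12)).shape)      # torch.Size([8, 72, 1])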

The machine learning models were trained on input data from 2015 to 2020, and their PM forecasting performance was tested on 2021 data. Compared to the numerical model, the machine learning model tended to achieve a higher ACC and a similar or lower FAR and POD.

The time series show that the machine learning PM forecasts follow the PM measurements more closely than the numerical model. In particular, the machine learning model can appropriately predict the low and high PM concentrations that the numerical model tends to overestimate.

The machine learning forecasting model based on the DNN and Seq2Seq thus shows improved PM forecasting performance compared with the numerical forecasting model. However, the machine learning model has the limitation that it cannot consider external inflow effects.

To overcome this drawback, the models should be extended with additional machine learning modules, such as a CNN capturing the spatial features of PM concentrations.


Acknowledgements

This study was supported in part by the ‘Experts Training Graduate Program for Particulate Matter Management’ from the Ministry of Environment, Korea and by a grant from the National Institute of Environmental Research (NIER), funded by the Ministry of Environment (ME) of the Republic of Korea (NIER-2022-04-02-068).


How to cite: Wang, K.-H., Lee, C.-Y., Lee, J.-Y., Jung, M.-W., Kim, D.-G., Han, S.-H., Choi, D.-R., and Yun, H.: Comparison of PM2.5 concentrations prediction model performance using Artificial Intelligence, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12566, https://doi.org/10.5194/egusphere-egu23-12566, 2023.

EGU23-13013 | ECS | Posters on site | ITS1.13/AS5.2

Using cGAN for cloud classification from RGB pictures 

Markus Rosenberger, Manfred Dorninger, and Martin Weißmann

Clouds of all kinds play a large role in many atmospheric processes including, e.g., radiation and moisture transport, and their type offers insight into the dynamics of the atmosphere. Hence, the observation of clouds from the Earth's surface has always been important for analysing the current weather and its evolution during the day. However, cloud observations by human observers are labour-intensive and hence also costly. In addition, cloud classifications done by human observers are always subjective to some extent. Finding an efficient method for automated observations would solve both problems. Although clouds have been observed operationally from satellites for decades, observations from the surface shed light on a different set of characteristics. Moreover, the WMO defined its cloud classification standards according to visual cloud properties as observed at the Earth's surface. Thus, in this work we propose to use machine learning methods to classify clouds from RGB pictures taken at the surface. Explicitly, a conditional Generative Adversarial Network (cGAN) is trained to discriminate between 30 different categories, 10 for each cloud level (low, medium, and high). Besides showing robust results in different image classification problems, an additional advantage of using a GAN instead of a classical convolutional neural network is that its output can artificially enhance the size of the training data set. This is especially useful if the number of available pictures is unevenly distributed among the different classes. Additional background observations, such as cloud cover and cloud base height, can also be used to further improve the performance of the cGAN. Together with a cloud camera, a properly trained cGAN can observe and classify clouds with a high temporal resolution of the order of seconds, which can be used, e.g., for model verification or to efficiently monitor the current state of the weather and its short-term evolution. First results will also be presented.
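
A hedged sketch of a class-conditional GAN pair for such a 30-category setup (toy-scale fully connected networks; the study's architecture is not specified here):

    # Sketch: the class label enters generator and discriminator as an embedding.
    import torch

    n_classes, z_dim, img = 30, 64, 32 * 32 * 3     # 10 categories per cloud level

    G = torch.nn.Sequential(torch.nn.Linear(z_dim + n_classes, 256), torch.nn.ReLU(),
                            torch.nn.Linear(256, img), torch.nn.Tanh())
    D = torch.nn.Sequential(torch.nn.Linear(img + n_classes, 256),
                            torch.nn.LeakyReLU(0.2), torch.nn.Linear(256, 1))
    embed = torch.nn.Embedding(n_classes, n_classes)

    labels = torch.randint(0, n_classes, (16,))
    z = torch.randn(16, z_dim)
    fake = G(torch.cat([z, embed(labels)], dim=1))      # label-conditioned samples
    logit = D(torch.cat([fake, embed(labels)], dim=1))  # real/fake score given label

Such label-conditioned samples are what would allow the training set to be artificially enlarged for under-represented classes.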

How to cite: Rosenberger, M., Dorninger, M., and Weißmann, M.: Using cGAN for cloud classification from RGB pictures, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13013, https://doi.org/10.5194/egusphere-egu23-13013, 2023.

EGU23-13143 | ECS | Posters on site | ITS1.13/AS5.2

Comparison of LSTM, GraphNN, and IrradPhyDNet based Approaches for High-resolution Solar Irradiance Nowcasting 

Petrina Papazek, Irene Schicker, and Pascal Gfähler

With fast parallel computing hardware, particularly GPUs, becoming more accessible in the geosciences, deep learning techniques can now efficiently handle large amounts of recorded observations and satellite-derived data and are able to learn complex structures across time series. A suitable deep learning setup can thus generate highly resolved weather forecasts in real time and on demand. Forecasts of irradiance and radiation are challenging for machine learning as they embrace a high degree of diurnal and seasonal variation.

Continuously expanding PV/solar power production is becoming one of our most important fossil-fuel-free energy sources. Unlike the only recently emerging PV power observations, solar irradiance offers long time series from automated weather station networks. Since solar irradiance is directly linked to PV output, highly resolved solar irradiance forecasts, from nowcasting to the short range, play a crucial role in decision support and in managing PV.

In this study, we investigate the suitability of several deep learning techniques adapted and developed for a set of heterogeneous data sources at selected locations. We compare the forecast results to traditional, however computationally expensive, numerical weather prediction (NWP) models and rapid update cycle models. Relevant input features include 3D fields from NWP models (e.g., AROME), satellite data and products (e.g., CAMS), radiation time series from remote sensing, and observation time series (site observations and nearby sites). The amount of time series data can be extended by a synthetic data generator, a part of our deep learning framework. The main models investigated include a sequence-to-sequence LSTM (long short-term memory) model using a climatological background model or NWP for post-processing, a graph neural network model, and an analogs-based deep learning method. Furthermore, a novel neural network model based on two other ideas, IrradianceNet and PhyDNet, was developed. IrradPhyDNet combines the skills of IrradianceNet and PhyDNet and showed improved performance in comparison to the original models.

The developed methods yield, in general, high forecast skill. For selected case studies of extreme events (e.g., Saharan dust), all novel methods outperformed the traditional methods. Different combinations of inputs and processing steps are part of the analysis.

How to cite: Papazek, P., Schicker, I., and Gfähler, P.: Comparison of LSTM, GraphNN, and IrradPhyDNet based Approaches for High-resolution Solar Irradiance Nowcasting, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13143, https://doi.org/10.5194/egusphere-egu23-13143, 2023.

EGU23-13322 | ECS | Posters on site | ITS1.13/AS5.2

Nodal Ambient Noise Tomography and automatic picking of dispersion curves with convolutional neural network: case study at Vulcano-Lipari, Italy 

Douglas Stumpp, Elliot Amir Jiwani-Brown, Célia Barat, Matteo Lupi, Francisco Muñoz, Thomas Planes, and Geneviève Savard

The ambient noise tomography (ANT) method is widely adopted to reconstruct shear-wave velocity anomalies and to generate high-resolution images of the crust and upper mantle. A critical step in this process is the extraction of surface-wave dispersion curves from cross-correlation functions of continuous ambient noise recordings, traditionally performed manually on dispersion spectrograms through human-machine interfaces. Picking of dispersion curves is sometimes prone to bias due to human interpretation. Furthermore, it is a laborious and time-consuming task that needs to be automated, especially when dealing with dense seismic networks of nodal geophones, where the large amount of generated data severely hinders manual picking approaches. In the last decade, several studies have successfully employed machine learning methods in the Earth sciences and across many seismological applications. Early studies have shown versatile and reliable solutions by treating dispersion curve extraction as a visual recognition problem.

We review and adapt a specific machine learning approach, deep convolutional neural networks, for use on dispersion spectrograms generated with the usual frequency-time analysis (FTAN) processing of ambient noise cross-correlations. To train and calibrate the algorithm, we use several available datasets acquired in previous experiments across different geological settings. The main dataset consists of records acquired with a dense local geophone network (150 short-period stations sampling at 250 Hz) deployed for one month in October 2021, during the volcanic unrest of the Vulcano-Lipari complex, Italy. The network also includes an additional 17 permanent broadband stations (sampling at 100 Hz) maintained by the National Institute of Geophysics and Volcanology (INGV) in Italy. We evaluate the performance of the dispersion curve extraction algorithm. The automatically picked dispersion curves will be used to construct a shear-wave velocity model of the Vulcano-Lipari magmatic plumbing system and the surrounding area of the Aeolian archipelago.
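
As a hedged illustration of treating the pick as a visual recognition problem (a toy CNN with a soft-argmax readout, not the study's architecture), one can regress one group-velocity pick per period column of an FTAN image:

    # Sketch: FTAN spectrogram in, one picked velocity per period out.
    import torch

    n_periods, n_vels = 64, 128

    net = torch.nn.Sequential(
        torch.nn.Conv2d(1, 16, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(16, 16, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(16, 1, 1),
    )

    spec = torch.randn(8, 1, n_vels, n_periods)     # batch of FTAN images
    logits = net(spec).squeeze(1)                   # (8, n_vels, n_periods)
    prob = torch.softmax(logits, dim=1)             # distribution over velocity axis
    vel_axis = torch.linspace(1.0, 5.0, n_vels).view(1, n_vels, 1)  # km/s grid
    picked = (prob * vel_axis).sum(dim=1)           # (8, n_periods) soft picks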


How to cite: Stumpp, D., Amir Jiwani-Brown, E., Barat, C., Lupi, M., Muñoz, F., Planes, T., and Savard, G.: Nodal Ambient Noise Tomography and automatic picking of dispersion curves with convolutional neural network: case study at Vulcano-Lipari, Italy, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13322, https://doi.org/10.5194/egusphere-egu23-13322, 2023.

EGU23-13367 | ECS | Posters on site | ITS1.13/AS5.2

Framework for creating daily semantic segmentation maps of classified eddies using SLA along-track altimetry data 

Eike Bolmer, Adili Abulaitijiang, Luciana Fenoglio-Marc, Jürgen Kusche, and Ribana Roscher

Mesoscale eddies are gyrating currents in the ocean with horizontal scales from 10 km up to 100 km and above. They transport water mass, heat, and nutrients, and are therefore of interest to, among others, marine biologists, oceanographers, and geodesists. Usually, gridded sea level anomaly (SLA) maps, processed from several radar altimetry missions, are used to detect eddies. However, operational processors create multi-mission (processing level 4) SLA grid maps with an effective spatiotemporal resolution far lower than their grid spacing and temporal resolution.

This drawback leads to erroneous eddy detection. We therefore investigate whether the higher-resolution along-track data could instead be used to classify the SLA observations into cyclonic, anticyclonic, or no eddies more accurately than with processed SLA grid map products. With our framework, we aim to infer a daily two-dimensional segmentation map of classified eddies. Due to repeat cycles between 10 and 35 days and cross-track spacings of a few tens to a few hundreds of km, ocean eddies are clearly visible in altimeter observations but are typically covered only by a few ground tracks, so the spatiotemporal context within the input data varies strongly from day to day. Conventional convolutional neural networks (CNNs), however, rely on data without varying gaps or jumps in time and space in order to use the intrinsic spatial or temporal context of the observations. This challenge therefore needs to be addressed with a deep neural network that, on the one hand, utilizes the spatiotemporal context information within the along-track data and, on the other hand, can output a two-dimensional segmentation map from data of varying sparsity. Our architecture, Teddy, uses a transformer module to encode and process the spatiotemporal information along the ground track's sea level anomaly data, producing a sparse feature map. This map is then fed into a sparsity-invariant convolutional neural network to infer a two-dimensional segmentation map of classified eddies. The reference data used to train Teddy are produced by an open-source geometry-based approach (py-eddy-tracker [1]).

The focus of this presentation is on how we implemented this approach to derive two-dimensional segmentation maps of classified eddies with our deep neural network architecture Teddy from along-track altimetry. We show results and limitations for the classification of eddies using only along-track SLA data from the multi-mission level 3 product of the Copernicus Marine Environment Monitoring Service (CMEMS) for the 2017-2019 period in the Gulf Stream region. We find that, using our methodology, we can create two-dimensional maps of classified eddies from along-track data without using preprocessed SLA grid maps.

[1] Mason, E., Pascual, A., and McWilliams, J. C.: A new sea surface height-based code for oceanic mesoscale eddy tracking, Journal of Atmospheric and Oceanic Technology, 31(5), 1181-1188, 2014.

How to cite: Bolmer, E., Abulaitijiang, A., Fenoglio-Marc, L., Kusche, J., and Roscher, R.: Framework for creating daily semantic segmentation maps of classified eddies using SLA along-track altimetry data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13367, https://doi.org/10.5194/egusphere-egu23-13367, 2023.

EGU23-13771 | Orals | ITS1.13/AS5.2

Machine Learning Emulation of 3D Shortwave Radiative Transfer for Shallow Cumulus Cloud Fields 

Jui-Yuan Christine Chiu, Chen-Kuang Kevin Yang, Jake J. Gristey, Graham Feingold, and William I. Gustafson

Clouds play an important role in determining the Earth's radiation budget. Despite their complex and three-dimensional (3D) structures, their interactions with radiation in models are often simplified to one dimension (1D), given the time required to compute radiative transfer. Such a simplification ignores cloud inhomogeneity and horizontal photon transport in radiative processes, which may be an acceptable approximation for low-resolution models but can lead to significant errors and impact cloud evolution predictions in high-resolution simulations. Since model development and operations are heading toward higher resolutions that are more susceptible to radiation errors, a fast and accurate 3D radiative transfer scheme becomes important and necessary. To address this need, we develop a machine-learning-based 3D radiative transfer emulator that provides surface radiation, shortwave fluxes at all layers, and heating rate profiles. The emulators are trained for highly heterogeneous shallow cumulus fields under different solar positions. We will assess the accuracy and efficiency of the emulators and discuss their potential applications.

How to cite: Chiu, J.-Y. C., Yang, C.-K. K., Gristey, J. J., Feingold, G., and Gustafson, W. I.: Machine Learning Emulation of 3D Shortwave Radiative Transfer for Shallow Cumulus Cloud Fields, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13771, https://doi.org/10.5194/egusphere-egu23-13771, 2023.

EGU23-14051 | ECS | Posters on site | ITS1.13/AS5.2

Multi-modal data assimilation of sea surface currents from AIS data streams and satellite altimetry using 4DVARNet 

Simon Benaïchouche, Clément Le Goff, Brahim Boussidi, François Rousseau, and Ronan Fablet

Over the last decades, space oceanography missions, particularly altimeter missions, have greatly advanced our ability to observe sea surface dynamics. However, they still struggle to resolve spatial scales below ~100 km. On a global scale, sea surface currents are derived from sea surface height under a geostrophic assumption. While future altimeter missions should improve the observation of sea surface height, the observation of sea surface currents using altimetry techniques will remain indirect. On the other hand, recent works have considered the use of AIS (automatic identification system) data as a new means to reconstruct sea surface currents: AIS data streams provide an indirect observational model of total currents, including ageostrophic phenomena. In this work, we use the supervised learning framework 4DVARNet, a data-driven approach that allows us to perform multi-modal experiments. We focus on an Observing System Simulation Experiment (OSSE) in a region of the Gulf Stream and show that the joint use of AIS and sea surface height (SSH) measurements could improve the reconstruction of sea surface currents, in terms of the physical and time scales resolved, with respect to products derived solely from AIS or SSH observations.

How to cite: Benaïchouche, S., Le Goff, C., Boussidi, B., Rousseau, F., and Fablet, R.: Multi-modal data assimilation of sea surface currents from AIS data streams and satellite altimetry using 4DVARNet, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14051, https://doi.org/10.5194/egusphere-egu23-14051, 2023.

EGU23-15183 | ECS | Orals | ITS1.13/AS5.2

Deep learning approximations of a CFD model for operational wind and turbulence forecasting 

Margrethe Kvale Loe and John Bjørnar Bremnes

The Norwegian Meteorological Institute has for many years applied a CFD model to downscale operational NWP forecasts to 100-200 m spatial resolution for wind and turbulence forecasting at about 20 Norwegian airports. Due to high computational costs, however, the CFD model can only be run twice per day, each time producing a 12-hour forecast. An approximate approach using deep learning, requiring far less compute resources, has therefore been developed. In this approach, the relation between relevant NWP forecast variables on grids of 2.5 km spatial resolution and wind and turbulence from the CFD model is approximated using neural networks with basic convolutional and dense layers. The deep learning models were trained on approximately two years of data separately for each airport. The results show that the models are to a large extent able to capture the characteristics of their corresponding CFD simulations, and the method is intended in due time to fully replace the current operational solution.

How to cite: Loe, M. K. and Bremnes, J. B.: Deep learning approximations of a CFD model for operational wind and turbulence forecasting, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15183, https://doi.org/10.5194/egusphere-egu23-15183, 2023.

EGU23-15684 | ECS | Posters on site | ITS1.13/AS5.2

Semi-supervised feature-based learning for prediction of Mass Accumulation Rate of sediments 

Naveenkumar Parameswaran, Everardo Gonzalez, Ewa Burwicz-Galerne, David Greenberg, Klaus Wallmann, and Malte Braack

Mass accumulation rates of sediments [g/cm2/yr] and sedimentation rates [cm/yr] on the seafloor are important for understanding various benthic properties, such as the rate of carbon sequestration in the seafloor and seafloor geomechanical stability. Several machine learning models, such as random forests and k-nearest neighbours, have been proposed for the prediction of geospatial data in the marine geosciences, but they face significant challenges, such as the limited number of labels for training, skewed data distributions, and a large number of features. Previous model predictions show deviations in the global sediment budget, a parameter used to assess a model's predictive validity, revealing that state-of-the-art models do not represent sedimentation rates accurately.

Here we present a semi-supervised deep learning methodology to improve the prediction of sedimentation rates, making use of around 9x10^6 unlabelled data points. The semi-supervised neural network implementation has two parts: unsupervised pretraining using an encoder-decoder network, followed by supervised fine-tuning. The encoder, with the weights optimized during unsupervised training, is extracted and fitted with layers that map to the target dimension. This network is then fine-tuned with 2782 labelled data points, which are observed sedimentation rates from peer-reviewed sources. The fine-tuned model then predicts the rate and quantity of sediment accumulating on the ocean floor globally.
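
A minimal sketch of this two-stage scheme in PyTorch, with illustrative feature counts and training loops (architectures and hyperparameters are placeholders):

    # Sketch: unsupervised encoder-decoder pretraining, then supervised fine-tuning.
    import torch

    n_features, latent = 20, 8
    encoder = torch.nn.Sequential(torch.nn.Linear(n_features, 64), torch.nn.ReLU(),
                                  torch.nn.Linear(64, latent))
    decoder = torch.nn.Sequential(torch.nn.Linear(latent, 64), torch.nn.ReLU(),
                                  torch.nn.Linear(64, n_features))

    # Stage 1: reconstruction on abundant unlabelled points.
    X_unlab = torch.randn(4096, n_features)
    opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)
    for _ in range(100):
        opt.zero_grad()
        loss = ((decoder(encoder(X_unlab)) - X_unlab) ** 2).mean()
        loss.backward()
        opt.step()

    # Stage 2: regression head on the pretrained encoder, tuned on labels.
    head = torch.nn.Linear(latent, 1)
    model = torch.nn.Sequential(encoder, head)
    X_lab, y_lab = torch.randn(2782, n_features), torch.randn(2782, 1)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(200):
        opt.zero_grad()
        loss = ((model(X_lab) - y_lab) ** 2).mean()
        loss.backward()
        opt.step()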

The developed semi-supervised neural network provides better predictions than supervised models trained only on labelled data. The predictions of the semi-supervised neural network are compared with those of supervised neural networks with and without dimensionality reduction (using principal component analysis).

How to cite: Parameswaran, N., Gonzalez, E., Burwicz-Galerne, E., Greenberg, D., Wallmann, K., and Braack, M.: Semi-supervised feature-based learning for prediction of Mass Accumulation Rate of sediments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15684, https://doi.org/10.5194/egusphere-egu23-15684, 2023.

EGU23-15756 | ECS | Posters on site | ITS1.13/AS5.2

Physiography improvements in numerical weather prediction digital twin engines 

Thomas Rieutord, Geoffrey Bessardon, and Emily Gleeson

The next generation of numerical weather prediction models (so-called digital twin engines) will reach the hectometric scale, for which existing physiography databases are insufficient. Our work leverages machine learning and open-access data to produce a more accurate, higher-resolution physiography database. One component to improve is the land cover map. The reference data gather multiple high-resolution thematic maps using an agreement-based decision tree. The input data are taken from the Sentinel-2 satellite, and the land cover map is then generated by image segmentation. This work implements and compares several algorithms of different families to study their suitability for the land cover classification problem; the sensitivity to data quality will also be studied. Compared to existing work, this work is innovative both in the construction of the reference map (leveraging existing maps and fit for end-user purposes) and in the diversity of algorithms compared to produce our land cover map.

How to cite: Rieutord, T., Bessardon, G., and Gleeson, E.: Physiography improvements in numerical weather prediction digital twin engines, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15756, https://doi.org/10.5194/egusphere-egu23-15756, 2023.

EGU23-15892 | ECS | Posters on site | ITS1.13/AS5.2

Towards emulated Lagrangian particle dispersion model footprints for satellite observations 

Elena Fillola, Raul Santos-Rodriguez, and Matt Rigby

Lagrangian particle dispersion models (LPDMs) have been used extensively to calculate source-receptor relationships ("footprints") for use in greenhouse gas (GHG) flux inversions. However, because a backward-running model simulation is required for each data point, LPDMs do not scale well to very large datasets, which makes them unsuitable for GHG inversions using high-resolution satellite instruments such as TROPOMI. In this work, we demonstrate how machine learning (ML) can be used to accelerate footprint production, first presenting a proof-of-concept emulator for ground-based site observations and then discussing work in progress on an emulator suitable for satellite observations. In Fillola et al. (2023), we presented an ML emulator for NAME, the Met Office's LPDM, which outputs footprints for a small region around an observation point using purely meteorological variables as inputs. The footprint magnitude at each grid cell in the domain is modelled independently using gradient-boosted regression trees. The model is evaluated for seven sites, producing a footprint in 10 ms, compared to around 10 minutes for the 3D simulator, and achieving R2 values between 0.6 and 0.8 for CH4 concentrations simulated at the sites when compared to the time series generated by NAME. Following on from this work, we demonstrate how the same emulator can be applied to satellite data to reproduce footprints immediately around any measurement point in the domain, evaluating this application with data for Brazil and North Africa and obtaining R2 values of around 0.5 for simulated CH4 concentrations. Furthermore, we propose new emulator architectures for LPDMs applied to satellite observations. These new architectures should tackle some of the weaknesses of the existing approach, for example by propagating information more flexibly in space and time, potentially improving the accuracy of the derived footprints and extending the prediction capabilities to bigger domains.
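
A hedged sketch of per-cell footprint emulation with gradient-boosted trees, using scikit-learn on synthetic data (the multi-output wrapper and sizes are illustrative choices, not necessarily the published setup):

    # Sketch: one boosted regressor per footprint grid cell.
    import numpy as np
    from sklearn.ensemble import HistGradientBoostingRegressor
    from sklearn.multioutput import MultiOutputRegressor

    rng = np.random.default_rng(0)
    n_obs, n_met, n_cells = 2000, 10, 16 * 16     # 16x16 footprint around the site
    X = rng.normal(size=(n_obs, n_met))           # meteorological predictors
    Y = rng.normal(size=(n_obs, n_cells))         # footprint magnitude per cell

    emulator = MultiOutputRegressor(HistGradientBoostingRegressor(max_iter=100))
    emulator.fit(X[:1500], Y[:1500])
    pred = emulator.predict(X[1500:])             # milliseconds per footprint
    print(pred.shape)                             # (500, 256)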

How to cite: Fillola, E., Santos-Rodriguez, R., and Rigby, M.: Towards emulated Lagrangian particle dispersion model footprints for satellite observations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15892, https://doi.org/10.5194/egusphere-egu23-15892, 2023.

EGU23-15994 | ECS | Posters on site | ITS1.13/AS5.2

Uncertainty quantification in variational data assimilation with deep learning 

Nicolas Lafon, Philippe Naveau, and Ronan Fablet

The spatio-temporal reconstruction of a dynamical process from observational data is at the core of a wide range of applications in the geosciences. This is particularly true for weather forecasting, operational oceanography, and climate studies. However, the reconstruction of a given dynamic and the prediction of future states must take into account the uncertainties that affect the system. The available observational measurements are only provided with limited accuracy. Besides, the encoded physical equations that model the evolution of the system do not capture the full complexity of the real system. Finally, the numerical approximation generates a non-negligible error. For these reasons, it seems relevant to calculate a probability distribution of the system state rather than only the most probable state. Using recent advances in machine learning techniques for inverse problems, we propose an algorithm that jointly learns a parametric distribution of the state, the dynamics governing the evolution of the parameters, and a solver. Experiments conducted on synthetic reference datasets, as well as on datasets describing environmental systems, validate our approach.

How to cite: Lafon, N., Naveau, P., and Fablet, R.: Uncertainty quantification in variational data assimilation with deep learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15994, https://doi.org/10.5194/egusphere-egu23-15994, 2023.

EGU23-16287 | ECS | Posters on site | ITS1.13/AS5.2

A machine learning emulator for forest carbon stocks and fluxes 

Carolina Natel de Moura, David Martin Belda, Peter Antoni, and Almut Arneth

Forests are a significant sink of the carbon dioxide (CO2) emitted by humans. Climate change is expected to impact forest systems and their role in the terrestrial carbon cycle in several ways: for example, the fertilization effect of increased atmospheric CO2 and the lengthening of the growing season in northern temperate and boreal areas may increase forest productivity, while more frequent extreme climate events such as storms and windthrows, drought spells, and wildfires might shorten disturbance return periods, increasing forest land loss and reducing the carbon stored in vegetation and soils. In addition, forest management in response to an increased demand for wood products and fuel can affect carbon storage in ecosystems and wood products. State-of-the-art Dynamic Global Vegetation Models (DGVMs) simulate forest responses to environmental and human processes; however, running these models globally for many climate and management scenarios becomes challenging due to computational constraints. Integrating process-based models and machine learning methods through emulation allows us to speed up computationally expensive simulations. In this work, we explore the use of machine learning to surrogate the LPJ-GUESS DGVM. The emulator is spatially aware, representing forests across the globe at a flexible spatial resolution, and considers past climate and forest management practices to account for legacy effects. The training data for the emulator are derived from dedicated runs of the DGVM sampled across four dimensions relevant to forest carbon and yield: atmospheric CO2 concentration, air Temperature, Precipitation, and forest Management (CTPM). The emulator captures relevant forest responses to climate and management in a lightweight form and will support the development of the coupled socio-economic/ecologic land system model LandSyMM (landsymm.earth). Other relevant scientific applications include the analysis of optimal forestry protocols under climate change and the forest potential in climate change mitigation.

 

How to cite: Natel de Moura, C., Belda, D. M., Antoni, P., and Arneth, A.: A machine learning emulator for forest carbon stocks and fluxes, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16287, https://doi.org/10.5194/egusphere-egu23-16287, 2023.

EGU23-16597 | Posters on site | ITS1.13/AS5.2 | Highlight

Global Decadal Sea Surface Height Forecast with Conformal Prediction 

Nils Lehmann, Jonathan Bamber, and Xiaoxiang Zhu

One of the many ways in which anthropogenic climate change impacts our planet is
rising sea levels. The rate of sea level rise (SLR) across the oceans is,
however, not uniform in space or time and is influenced by a complex interplay
of ocean dynamics, heat uptake, and surface forcing. As a consequence,
short-term (years to a decade) regional SLR patterns are difficult to model
using conventional deterministic approaches. For example, the latest climate
model projections (called CMIP6) show some agreement in the globally integrated
rate of SLR but poor agreement when it comes to spatially-resolved
patterns. However, such forecasts are valuable for adaptation planning in
coastal areas and for protecting low-lying assets.
Rather than a deterministic modeling approach, here we explore the possibility
of exploiting the high quality satellite altimeter derived record of sea surface
height variations, which cover the global oceans outside of ice-infested waters
over a period of 30 years. Alongside this rich and unique satellite record,
several data-driven models have shown tremendous potential for various
applications in Earth System science. We explore several data-driven deep
learning approaches for sea surface height forecasts over multi-annual to
decadal time frames. A limitation of some machine learning approaches is the
lack of any kind of uncertainty quantification, which is problematic for
applications where actionable evidence is sought. As a consequence, we equip
our models with a rigorous measure of uncertainty, namely conformal prediction, which
is a model and dataset agnostic method that provides calibrated predictive
uncertainty with proven coverage guarantees. Based on a 30-year satellite
altimetry record and auxiliary climate forcing data from reanalysis such as
ERA5, we demonstrate that our methodology is a viable and attractive alternative
for decadal sea surface height forecasts.
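
For readers unfamiliar with the method, the following is a minimal split-conformal sketch for regression, using a toy dataset and a generic scikit-learn model; it is illustrative only and not the authors' forecasting pipeline:

```python
# Split conformal prediction: calibrated intervals with coverage guarantees,
# agnostic to the underlying model and dataset.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ rng.normal(size=5) + 0.3 * rng.normal(size=1000)
X_fit, X_cal, X_new = X[:600], X[600:900], X[900:]
y_fit, y_cal = y[:600], y[600:900]

model = Ridge().fit(X_fit, y_fit)
scores = np.abs(y_cal - model.predict(X_cal))   # nonconformity scores
alpha = 0.1                                     # target 90% coverage
q = np.quantile(scores,
                np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores))
pred = model.predict(X_new)
lower, upper = pred - q, pred + q               # calibrated intervals
```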

How to cite: Lehmann, N., Bamber, J., and Zhu, X.: Global Decadal Sea Surface Height Forecast with Conformal Prediction, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16597, https://doi.org/10.5194/egusphere-egu23-16597, 2023.

EGU23-16936 | ECS | Orals | ITS1.13/AS5.2

Analysis of marine heat waves using machine learning 

Said Ouala, Bertrand Chapron, Fabrice Collard, Lucile Gaultier, and Ronan Fablet

Sea surface temperature (SST) is a critical parameter in the global climate system and plays a vital role in many marine processes, including ocean circulation, evaporation, and the exchange of heat and moisture between the ocean and atmosphere. As such, understanding the variability of SST is important for a range of applications, including weather and climate prediction, ocean circulation modeling, and marine resource management.

The dynamics of SST are the compound of multiple degrees of freedom that interact across a continuum of spatio-temporal scales. A first-order approximation of such a system was initially introduced by Hasselmann. In his pioneering work, Hasselmann (1976) discussed the interest in using a two-scale stochastic model to represent the interactions between slow and fast variables of the global ocean, climate, and atmosphere system. In this work, we examine the potential of machine learning techniques to derive relevant dynamical models of Sea Surface Temperature Anomaly (SSTA) data in the Mediterranean Sea. We focus on the seasonal modulation of the SSTA and aim to understand the factors that influence the temporal variability of SSTA extremes. Our analysis shows that the variability of the SSTA can indeed be decomposed into slow and fast components. The dynamics of the slow variables are associated with the seasonal cycle, while the dynamics of the fast variables are linked to the SSTA response to rapid underlying processes such as local wind variability. Based on these observations, we approximate the probability density function of the SSTA data using a stochastic differential equation parameterized by a neural network. In this model, the drift function represents the seasonal cycle and the diffusion function represents the envelope of the fast SSTA response.
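
As a minimal illustration of such a model (a sketch under our own assumptions about shapes and architecture, not the authors' implementation), the PyTorch snippet below defines an SDE whose drift and diffusion are small neural networks and simulates it with the Euler-Maruyama scheme:

```python
# Neural SDE sketch: drift models the seasonal cycle, diffusion the envelope
# of the fast response; simulated with Euler-Maruyama.
import torch
import torch.nn as nn

class NeuralSDE(nn.Module):
    def __init__(self):
        super().__init__()
        self.drift = nn.Sequential(nn.Linear(2, 32), nn.Tanh(),
                                   nn.Linear(32, 1))
        self.diffusion = nn.Sequential(nn.Linear(2, 32), nn.Tanh(),
                                       nn.Linear(32, 1), nn.Softplus())

    def step(self, x, t, dt):
        inp = torch.cat([x, t.expand_as(x)], dim=-1)  # state and time of year
        noise = torch.randn_like(x) * dt.sqrt()
        return x + self.drift(inp) * dt + self.diffusion(inp) * noise

sde = NeuralSDE()
x = torch.zeros(16, 1)                 # 16 sample paths of the SSTA
dt = torch.tensor(1.0 / 365)
for day in range(365):                 # one simulated year
    t = torch.tensor([[day / 365.0]])
    x = sde.step(x, t, dt)
```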

 

How to cite: Ouala, S., Chapron, B., Collard, F., Gaultier, L., and Fablet, R.: Analysis of marine heat waves using machine learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16936, https://doi.org/10.5194/egusphere-egu23-16936, 2023.

EGU23-338 | ECS | Posters virtual | ITS1.1/NH0.1

A Stacking Ensemble Deep Learning Approach for Post Disaster Building Assessment using UAV Imagery 

Leon Sim, Fang-Jung Tsai, and Szu-Yun Lin

Traditional post-disaster building damage assessments were performed manually by response teams, which was risky and time-consuming. Advanced remote sensing technology, such as unmanned aerial vehicles (UAVs), makes it possible to acquire high-quality aerial videos while operating at a variety of altitudes and angles. The collected data is then fed into a neural network for training and validation. In this study, the object detection model YOLO was utilized, which is capable of predicting both bounding boxes and damage levels. The network was trained using the ISBDA dataset, which was created from aerial videos of the aftermath of Hurricane Harvey in 2017, Hurricane Michael and Hurricane Florence in 2018, and three tornadoes in 2017, 2018, and 2019 in the United States. The Joint Damage Scale was used to classify the buildings in this dataset into four categories: no damage, minor damage, major damage, and destroyed. However, the numbers of major damage and destroyed instances are significantly lower than the numbers of no damage and minor damage instances in the dataset. Also, the damage characteristics of the minor and major damage classes are similar under this type of disaster. These factors made the YOLO model prone to misclassifying the intermediate damage levels, i.e., minor and major damage, in our earlier experiments. This study aimed to improve the YOLO model using a stacking ensemble deep learning approach with an image classification model called MobileNet. First, the ISBDA dataset was refined and used to train the YOLO network and the MobileNet network separately, with the latter providing two-class predictions (0 for no damage or minor damage, 1 for major damage or destroyed) rather than the four classes of the former. In the inference phase, the initial predictions from the trained YOLO network, including bounding box coordinates, confidence scores for the four damage classes, and the predicted class, were extracted and passed to the trained MobileNet to generate secondary predictions for each building. Based on the secondary predictions, two hyperparameters were utilized to refine the initial predictions by modifying the confidence scores of each class, and these hyperparameters were tuned during this phase. Lastly, the tuned hyperparameters were applied to the testing dataset to evaluate the performance of the proposed method. The results show that our stacking ensemble method obtains more reliable predictions for the intermediate classes.
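
The refinement step can be pictured with the toy sketch below: YOLO's four-class confidence scores are adjusted using the binary classifier's coarse prediction and two tunable hyperparameters. All values and the exact reweighting rule are illustrative assumptions, not the trained system:

```python
# Toy two-stage score refinement for the stacking ensemble described above.
import numpy as np

def refine(yolo_scores, binary_pred, alpha=1.5, beta=0.5):
    """yolo_scores: (4,) confidences for [none, minor, major, destroyed].
    binary_pred: 0 (none/minor) or 1 (major/destroyed) from the classifier."""
    scores = yolo_scores.copy()
    group = [2, 3] if binary_pred == 1 else [0, 1]
    other = [0, 1] if binary_pred == 1 else [2, 3]
    scores[group] *= alpha        # boost the group the classifier agrees with
    scores[other] *= beta         # down-weight the other group
    return scores / scores.sum()  # renormalize to a distribution

scores = np.array([0.10, 0.45, 0.35, 0.10])
print(refine(scores, binary_pred=1))  # shifts mass toward major/destroyed
```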

 

How to cite: Sim, L., Tsai, F.-J., and Lin, S.-Y.: A Stacking Ensemble Deep Learning Approach for Post Disaster Building Assessment using UAV Imagery, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-338, https://doi.org/10.5194/egusphere-egu23-338, 2023.

Since Taiwan is located on the Pacific Ring of Fire, seismic activity of varying magnitudes occurs almost every day. Some of these seismic events have caused severe disasters, resulting in loss of property, casualties, and damage to important public facilities. Investigating the long-term spatiotemporal pattern of seismic activity is therefore a crucial task for understanding its causes and predicting future activity, so that disaster prevention measures can be carried out in advance. Previous studies mostly focused on the causes of single seismic events at small spatiotemporal scales. In this study, data from 1987 to 2020 are used, including seismic events from the United States Geological Survey (USGS), ambient environmental factors such as daily air temperature from the Taiwan Central Weather Bureau (CWB), and daily sea surface temperature data from the National Oceanic and Atmospheric Administration (NOAA). The difference between land air temperature and sea surface temperature (SST) is then compared against the occurrence of seismic activity to examine the correlation between seismic events and anomalous temperature differences. The results show that many seismic events are accompanied by positive and negative anomalies of the temperature difference from 21 days before to 7 days after the event. Moreover, there is a specific trend of temperature-difference anomalies across magnitude intervals. In the magnitude ranges of 2.5 to 4 and greater than 6, almost all seismic events show significant anomalies in the temperature difference between land air temperature and SST compared with periods without seismic events. This study uncovers anomalous frequency signatures linking seismic activity and land-sea temperature differences. The significance of the difference between seismic and non-seismic periods was assessed using statistical analysis. Additionally, a deep neural network (DNN), as well as logistic regression and random forest machine learning models, were used to identify whether a seismic event will occur in different magnitude intervals. It is hoped that this can provide relevant information for the prediction of future seismic activity, to more accurately prevent the disasters it may cause.

How to cite: Chen, Y.-H. and Lin, Y.-C.: Investigating the Correlation between the Characteristics of Seismic Activity and Environmental Variables in Taiwan, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2564, https://doi.org/10.5194/egusphere-egu23-2564, 2023.

The 2010-2011 Canterbury Earthquake sequence (CES) led to unprecedented building damage in the Canterbury region, New Zealand. Commercial and residential buildings were significantly affected. Due to New Zealand’s unique insurance setting, around 80% of the losses were covered by insurance (Bevere & Balz, 2012; King et al., 2014). The Insurance Council of New Zealand (ICNZ) estimated the total economic losses to be more than NZ$40 billion, with the Earthquake Commission (EQC) and private insurers covering NZ$10 billion and NZ$21 billion of the losses, respectively (ICNZ, 2021). As a result of the CES and the 2016 Kaikoura earthquake, EQC’s Natural Disaster Fund was depleted (EQC, 2022). This highlighted the need for improved tools enabling damage and loss analysis for natural hazards.
This research project used residential building claims collected by EQC following the CES to develop a rapid seismic loss prediction model for residential buildings in Christchurch. Geographic information systems (GIS) tools, data science techniques, and machine learning (ML) were used for the model development. Before the training of the ML model, the claims data was enriched with additional information from external data sources. The seismic demand, building characteristics, soil conditions, and information about the liquefaction occurrence were added to the claims data. Once merged and pre-processed, the aggregated data was used to train ML models based on the main events in the CES. Emphasis was put on the interpretability and explainability of the model. The ML model delivered valuable insights related to the most important features contributing to losses. Those insights are aligned with engineering knowledge and observations from previous studies, confirming the potential of using ML for disaster loss prediction and management. Care was also put into the retrainability of the model to ensure that any new data from future earthquake events can rapidly be added to the model. 

How to cite: Roeslin, S.: Development of a Rapid Seismic Loss Prediction Model for Residential Buildings using Machine Learning - Christchurch, New Zealand, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2996, https://doi.org/10.5194/egusphere-egu23-2996, 2023.

EGU23-3928 | Orals | ITS1.1/NH0.1

Comparison of deep learning approaches to monitor trash screen blockage from CCTV cameras 

Remy Vandaele, Sarah L Dance, and Varun Ojha

We investigate the use of CCTV cameras and deep learning to automatically monitor trash screen blockage. 

Trash screens are installed to prevent debris from entering critical parts of river networks (pipes, tunnels, locks, ...). When debris piles up at a trash screen, it may block the waterway and can cause flooding. It is thus crucial to clean blocked trash screens to avoid flooding and consequent damage. Currently, maintenance crews must manually check a camera or river level data, or go on site to inspect the screen, to know if it needs cleaning. This wastes valuable time in emergency situations where blocked screens must be urgently cleaned (e.g., in case of forecast heavy rainfall). Some initial attempts at predicting trash screen blockage exist. However, these have not been widely adopted in practice. CCTV cameras can be easily installed at any location and can thus be used to monitor the state of trash screens, but the images need to be processed by an automated algorithm to determine whether the screen is blocked.

With the help of UK-based practitioners (Environment Agency and local councils), we have created a dataset of 40,000 CCTV trash screen images coming from 36 cameras, each labelled with blockage information. Using this database, we have compared three deep learning approaches to automate the detection of trash screen blockage: 

  • A binary image classifier, which takes as input a single image, and outputs a binary label that estimates whether the trash screen is blocked.
  • An approach based on anomaly detection which tries to reconstruct the input image with an auto-encoder trained on clean trash screen images.  In consequence, blocked trash screens are detected as anomalies by the auto-encoder.
  • An image similarity estimation approach based on the use of a siamese network, which takes as input two images and outputs a similarity index related, in our case, to whether both images contain trash. 

Using performance criteria chosen in discussion with practitioners (overall accuracy, false alarm rate, resilience to luminosity / moving fields of view, computing capabilities), we show that deep learning can be used in practice to automate the identification of blocked trash screens. We also analyse the strengths and weaknesses of each of these approaches and provide guidelines for their application.
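
The third approach above can be sketched as follows; the architecture, input sizes, and the use of a clean reference frame are illustrative assumptions, not the trained system:

```python
# Siamese similarity sketch: compare a clean-screen reference frame with the
# latest CCTV frame; a low similarity index suggests trash/blockage.
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64))

    def forward(self, a, b):
        za, zb = self.encoder(a), self.encoder(b)
        return nn.functional.cosine_similarity(za, zb)  # similarity index

net = SiameseNet()
clean = torch.randn(1, 3, 224, 224)    # reference image of a clean screen
current = torch.randn(1, 3, 224, 224)  # latest CCTV frame
print(net(clean, current).item())
```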

How to cite: Vandaele, R., Dance, S. L., and Ojha, V.: Comparison of deep learning approaches to monitor trash screen blockage from CCTV cameras, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3928, https://doi.org/10.5194/egusphere-egu23-3928, 2023.

EGU23-4455 | ECS | Posters virtual | ITS1.1/NH0.1

Traffic Monitoring System Design considering Multi-Hazard Disaster Risks 

Michele Gazzea, Reza Arghandeh, and Amir Miraki

Roadways are critical infrastructure in our society, providing services for people through and between cities. However, they are prone to closures and disruptions, especially after extreme weather events like hurricanes.

At the same time, traffic flow data are a fundamental type of information for any transportation system.

We tackle the problem of traffic sensor placement on roadways to address two tasks at the same time. The first task is traffic data estimation in ordinary situations, which is vital for traffic monitoring and city planning. We design a graph-based method to estimate traffic flow on roads where sensors are not present. The second one is enhanced observability of roadways in case of extreme weather events. We propose a satellite-based multi-domain risk assessment to locate roads at high risk of closures. Vegetation and flood hazards are taken into account. We formalize the problem as a search method over the network to suggest the minimum number and location of traffic sensors to place while maximizing the traffic estimation capabilities and observability of the risky areas of a city.
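
A toy greedy sketch of this kind of search is shown below (an assumption for illustration, not the authors' algorithm): sensor locations are chosen to maximize a combined score of traffic-estimation coverage and observability of high-risk roads:

```python
# Greedy sensor placement over a synthetic road network.
import numpy as np

rng = np.random.default_rng(0)
n_roads, budget = 50, 5
coverage = rng.random((n_roads, n_roads))  # coverage[i, j]: estimation gain of a sensor at i for road j
risk = rng.random(n_roads)                 # per-road closure risk from the satellite assessment

selected, covered = [], np.zeros(n_roads)
for _ in range(budget):
    gains = np.full(n_roads, -np.inf)
    for i in range(n_roads):
        if i not in selected:
            # risk-weighted total coverage if a sensor were added at i
            gains[i] = (np.maximum(covered, coverage[i]) * (1 + risk)).sum()
    best = int(np.argmax(gains))
    selected.append(best)
    covered = np.maximum(covered, coverage[best])
print("chosen sensor locations:", selected)
```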

How to cite: Gazzea, M., Arghandeh, R., and Miraki, A.: Traffic Monitoring System Design considering Multi-Hazard Disaster Risks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4455, https://doi.org/10.5194/egusphere-egu23-4455, 2023.

Earthquake-induced land deformation and structural failure are more severe over soft soils than over firm soils and rocks, owing to seismic site effects and liquefaction. The site-specific seismic site effects related to the amplification of ground motion, liquefaction, and landslides have spatial uncertainty depending on the local subsurface, surface geological, and topographic conditions. When the 2017 Pohang earthquake (M 5.4), South Korea's second strongest earthquake in decades, occurred, severe damage influenced by variable site response and vulnerability indicators was observed, concentrated in basin and basin-edge regions overlain by unconsolidated Quaternary sediments. Thus, nationwide site characterization is essential, considering empirical correlations between geotechnical site response and hazard parameters and surface proxies. Furthermore, with so many variables and tenuous correlations, machine learning classification models can prove more precise than parametric methods. This study established a multivariate seismic site classification system using machine learning techniques based on a geospatial big data platform.

Supervised machine learning classification techniques, specifically random forest, support vector machine (SVM), and artificial neural network (ANN) algorithms, were adopted. Supervised machine learning algorithms analyze a set of labeled training data consisting of a group of input data and desired output values. They produce an inferred function that can be used for predictions from given input data. To optimize the classification criteria while considering geotechnical uncertainty and local site effects, the training datasets, transformed by principal component analysis (PCA), were verified with k-fold cross-validation. The best training algorithm, assessed by loss estimators (the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC)) based on the confusion matrix, was then selected.

For the southeastern region of South Korea, boring log information (strata, standard penetration tests, etc.), a geological map (1:50k scale), a digital terrain model (5 m × 5 m resolution), and a soil map (1:250k scale) were collected and assembled as geospatial big data. As a preliminary step, to build spatially coincident datasets of geotechnical response parameters and surface proxies, mesh-type geospatial information was built by advanced geostatistical interpolation and simulation methods.

Site classification systems use seismic hazard parameters related to the geotechnical characteristics of the study area as the classification criteria. The current site classification systems in South Korea and the United States recommend Vs30, the average shear wave velocity (Vs) in the top 30 m of the ground. This criterion uses only the dynamic characteristics of the site without considering its geometric distribution characteristics. Thus, the geospatial information included the geo-layer thickness, surface proxies (elevation, slope, geological category, soil category), and Vs30. For liquefaction and landslide hazard estimation, liquefaction vulnerability indices (i.e., liquefaction potential or severity index) and landslide vulnerability indices (i.e., factor of safety or displacement) were also included as input features in the classifier modeling. Finally, the composite status against seismic site effects, liquefaction, and landslides was predicted as a hazard class (i.e., safe, slight, moderate, or extreme failure) based on the best-fitting classifier.
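
The workflow (PCA, k-fold cross-validation, ROC-AUC-based model selection) can be sketched as below; the features and labels are toy stand-ins for the geospatial inputs listed above, not the study's data:

```python
# Illustrative PCA + random forest pipeline with cross-validated ROC-AUC.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))     # e.g. Vs30, geo-layer thickness, slope, ...
y = rng.integers(0, 4, size=2000)  # hazard class: safe ... extreme failure

clf = make_pipeline(PCA(n_components=5),
                    RandomForestClassifier(n_estimators=200, random_state=0))
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc_ovr")
print("5-fold ROC-AUC (one-vs-rest):", auc.mean())
```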

How to cite: Kim, H.: Machine Learning-based Site Classification System for Earthquake-Induced Multi-Hazard in South Korea, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4757, https://doi.org/10.5194/egusphere-egu23-4757, 2023.

EGU23-4816 | ECS | Posters on site | ITS1.1/NH0.1

XAIDA4Detection: A Toolbox for the Detection and Characterization of Spatio-Temporal Extreme Events 

Jordi Cortés-Andrés, Maria Gonzalez-Calabuig, Mengxue Zhang, Tristan Williams, Miguel-Ángel Fernández-Torres, Oscar J. Pellicer-Valero, and Gustau Camps-Valls

The automatic anticipation and detection of extreme events constitute a major challenge in the current context of climate change, which has changed their likelihood and intensity. One of the main objectives within the EXtreme Events: Artificial Intelligence for Detection and Attribution (XAIDA) project (https://xaida.eu/) is related to developing novel approaches for the detection and localization of extreme events, such as tropical cyclones and severe convective storms, heat waves and droughts, as well as persistent winter extremes, among others. Here we introduce the XAIDA4Detection toolbox that allows for tackling generic problems of detection and characterization. The open-source toolbox integrates a set of advanced ML models, ranging in complexity, assumptions, and sophistication, and yields spatio-temporal explicit detection maps with probabilistic heatmap estimates. We included supervised and unsupervised methods, deterministic and probabilistic, neural networks based on convolutional and recurrent nets, and density-based methods. The toolbox is intended for scientists, engineers, and students with basic knowledge of extreme events, outlier detection techniques, and Deep Learning (DL), as well as Python programming with basic packages (Numpy, Scikit-learn, Matplotlib) and DL packages (PyTorch, PyTorch Lightning). This presentation will summarize the available features and their potential to be adapted to multiple extreme event problems and use cases.
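
In the spirit of the toolbox's unsupervised, density-based methods, the snippet below flags unusual cells in a gridded field and produces a heatmap-like score; this is NOT the XAIDA4Detection API, just a minimal scikit-learn sketch on toy data:

```python
# Density-based extreme flagging: low-density values get high anomaly scores.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
field = rng.normal(size=(64, 64))        # toy climate field (lat, lon)
field[40:44, 10:14] += 5.0               # injected "extreme" patch

kde = KernelDensity(bandwidth=0.5).fit(field.reshape(-1, 1))
log_density = kde.score_samples(field.reshape(-1, 1))
heatmap = (-log_density).reshape(64, 64)  # high value = unusual = extreme
print("most anomalous cell:",
      np.unravel_index(heatmap.argmax(), heatmap.shape))
```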

How to cite: Cortés-Andrés, J., Gonzalez-Calabuig, M., Zhang, M., Williams, T., Fernández-Torres, M.-Á., Pellicer-Valero, O. J., and Camps-Valls, G.: XAIDA4Detection: A Toolbox for the Detection and Characterization of Spatio-Temporal Extreme Events, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4816, https://doi.org/10.5194/egusphere-egu23-4816, 2023.

EGU23-5581 | Posters on site | ITS1.1/NH0.1

Vision Transformers for building damage assessment after natural disasters 

Adrien Lagrange, Nicolas Dublé, François De Vieilleville, Aurore Dupuis, Stéphane May, and Aymeric Walker-Deemin

Damage assessment is a critical step in crisis management. It must be fast and accurate in order to organize and scale the emergency response in a manner adapted to the real needs on the ground. The speed requirements motivate an automation of the analysis, at least in support of photo-interpretation. Deep Learning (DL) seems to be the most suitable methodology for this problem: on the one hand for the speed in obtaining the answer, and on the other hand for the high performance of these methods in extracting information from images. Following previous studies evaluating the potential contribution of DL methods for building damage assessment after a disaster, several conventional Deep Neural Network (DNN) and Transformer (TF) architectures were compared.

Made available at the end of 2019, the xView2 database appears to be the most interesting database for this study. It gathers images of disasters between 2011 and 2018 with 6 types of disasters: earthquakes, tsunamis, floods, volcanic eruptions, fires and hurricanes. For each of these disasters, pre- and post-disaster images are available with a ground truth containing the building footprint as well as the evaluation of the type of damage divided into 4 classes (no damage, minor damage, major damage, destroyed) similar to those considered in the study.

This study compares a wide range of DNN architectures, all based on an encoder-decoder structure. Two encoder families were implemented: EfficientNet (B0 to B7 configurations) and Swin TF (Tiny, Small, and Base configurations). Three adaptable decoders were implemented: UNet, DeepLabV3+, and FPN. Finally, to benefit from both pre- and post-disaster images, the trained models were designed to process images with a Siamese approach: both images are processed independently by the encoder, and the extracted features are then concatenated by the decoder.

Taking advantage of global information present in the image (such as the type of disaster), the Swin TF associated with the FPN decoder reaches better performance than all other encoder-decoder architectures. The shifted-windows mechanism enables the pipeline to process large images in a reasonable time, comparable to the processing time of EfficientNet-based architectures. An interesting additional result is that the models trained during this study do not seem to benefit much from extra-large configurations: both the Small and Tiny configurations reach the highest scores.
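
The Siamese encoder-decoder strategy can be pictured with the minimal PyTorch sketch below; layer sizes and depths are illustrative assumptions, not the study's Swin/FPN configuration:

```python
# Siamese change network: shared encoder, features concatenated for decoding.
import torch
import torch.nn as nn

class SiameseChangeNet(nn.Module):
    def __init__(self, n_classes=5):        # 4 damage levels + background
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(),
            nn.Conv2d(64, n_classes, 1))

    def forward(self, pre, post):
        f = torch.cat([self.encoder(pre), self.encoder(post)], dim=1)
        return self.decoder(f)               # per-pixel damage logits

net = SiameseChangeNet()
pre = torch.randn(1, 3, 256, 256)
post = torch.randn(1, 3, 256, 256)
print(net(pre, post).shape)                  # torch.Size([1, 5, 256, 256])
```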

How to cite: Lagrange, A., Dublé, N., De Vieilleville, F., Dupuis, A., May, S., and Walker-Deemin, A.: Vision Transformers for building damage assessment after natural disasters, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5581, https://doi.org/10.5194/egusphere-egu23-5581, 2023.

Natural and man-made disasters pose a threat to human life, flora and fauna, and infrastructure. It is critical to detect infrastructure damage quickly and accurately right after the occurrence of any disaster. The detection and assessment of infrastructure damage also help manage financial strategy. Recently, many researchers and agencies have made efforts to create high-resolution satellite imagery databases related to pre- and post-disaster events. Advanced remote sensing satellites can image the surface of the Earth accurately at up to 30 cm spatial resolution on a daily basis. These high spatial resolution (HSR) images can help assess the damage of any natural hazard by comparing pre- and post-disaster data. Such remote sensing imagery has limitations, however, such as cloud occlusion: buildings under thick cloud cannot be recognised in optical images. Manual assessment of the severity of damage to buildings and infrastructure by comparing bi-temporal HSR or airborne imagery is a tedious and subjective job. On the other hand, the emerging use of unmanned aerial vehicles (UAVs) makes it possible to assess the situation precisely. High-resolution UAV imagery and HSR satellite imagery can complement each other for critical infrastructure damage assessment. In this study, a novel approach is used to integrate UAV data with HSR satellite imagery for building damage assessment using a convolutional neural network (CNN) based deep learning model. The work is divided into two fundamental sub-tasks: first, building localisation in the pre-event images; and second, damage classification, assigning each building instance in the post-disaster images a unique label reflecting its degree of damage. For the study, HSR satellite imagery of 36 pairs of pre- and post-event scenes of natural hazards was acquired for 2021-22, and similarly available UAV-based data for these events was collected from open data sources. The data was then pre-processed, and building damage was assessed using a deep object-based semantic change detection framework (ChangeOS). This model was trained on the xView2 building damage assessment dataset, comprising ~20,000 images with ~730,000 building polygons of pre- and post-disaster events across the globe from 2011-2018. The experimental setup in this study includes training on the global dataset, testing on regional-scale building damage assessment using HSR satellite imagery, and local-scale testing using UAV imagery. The bi-temporal assessment of HSR images for the 2022 Indonesia earthquake yields an F1 score of ~67%, while the 2021 Uttarakhand flooding event shows an F1 score of ~64%. The UAV imagery from the Haiti earthquake event in 2011 also shows a lower but promising F1 score of ~54%. It is inferred that merging satellite and UAV HSR imagery for building damage assessment using the ChangeOS framework represents a robust tool to further promote future research in infrastructure maintenance strategy and policy management in disaster response.

How to cite: Gupta, S. and Nair, S.: A novel approach for infrastructural disaster damage assessment using high spatial resolution satellite and UAV imageries using deep learning algorithms., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5778, https://doi.org/10.5194/egusphere-egu23-5778, 2023.

EGU23-5913 | ECS | Orals | ITS1.1/NH0.1

Pluto: A global volcanic activity early warning system powered by large scale self-supervised deep learning on InSAR data 

Nikolaos Ioannis Bountos, Dimitrios Michail, Themistocles Herekakis, Angeliki Thanasou, and Ioannis Papoutsis

Artificial intelligence (AI) methods have emerged as a powerful tool to study and in some cases forecast natural disasters [1,2]. Recent works have successfully combined deep learning modeling with scientific knowledge stemming from the SAR Interferometry domain, propelling research on tasks like volcanic activity monitoring [3], associated with ground deformation. A milestone in this interdisciplinary field has been the release of the Hephaestus [4] InSAR dataset, facilitating automatic InSAR interpretation, volcanic activity localization, and the detection and categorization of atmospheric contributions in wrapped interferograms. Hephaestus contains annotations for approximately 20,000 InSAR frames, covering the 44 most active volcanoes in the world. The annotation was performed by a team of InSAR experts who manually examined each InSAR frame individually. However, even with such a large dataset, class imbalance remains a challenge: InSAR samples containing volcano deformation fringes are orders of magnitude fewer than those that do not. This is expected, since natural hazards are in principle rare in nature. To counter that, the authors of Hephaestus provide more than 100,000 unlabeled InSAR frames to be used for global large-scale self-supervised learning, which is more robust to class imbalance than supervised learning [5]. 

Motivated by the Hephaestus dataset and the insights provided by [2], we train global, task-agnostic models in a self-supervised learning fashion that can handle distribution shifts caused by spatio-temporal variability as well as major class imbalances. By finetuning such a model to the labeled part of Hephaestus we obtain the backbone for a global volcanic activity alerting system, namely Pluto. Pluto is a novel end-to-end AI based system that provides early warnings of volcanic unrest on a global scale.

Pluto automatically synchronizes its database with the Comet-LiCS [6] portal to receive newly generated Sentinel-1 InSAR data acquired over volcanic areas. The new samples are fed to our volcanic activity detection model. If volcanic activity is detected, an automatic email is sent to the service users containing information about the intensity, the exact location, and the type (Mogi, sill, dyke) of the event. To ensure a robust and ever-improving service, we augment Pluto with an iterative pipeline that collects samples that were misclassified in production and uses them to further improve the existing model. 
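
Schematically, the alerting loop looks like the sketch below; the function names, model interface, and threshold are invented for illustration and are not Pluto's actual implementation:

```python
# Schematic alerting loop over newly synchronized InSAR frames.
def process_new_frames(frames, model, notify, threshold=0.5):
    """Classify new InSAR frames and alert users on detected unrest."""
    for frame in frames:
        prob, location, kind = model.predict(frame)  # kind: Mogi, sill, dyke
        if prob >= threshold:
            notify(f"Volcanic unrest ({kind}) at {location}, "
                   f"confidence {prob:.2f}")
```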

 

[1] Kondylatos et al. "Wildfire danger prediction and understanding with Deep Learning." Geophysical Research Letters 49.17 (2022): e2022GL099368.

[2] Bountos et al. "Self-supervised contrastive learning for volcanic unrest detection." IEEE Geoscience and Remote Sensing Letters 19 (2021): 1-5.

[3] Bountos et al. "Learning from Synthetic InSAR with Vision Transformers: The case of volcanic unrest detection." IEEE Transactions on Geoscience and Remote Sensing (2022).

[4] Bountos et al. "Hephaestus: A large scale multitask dataset towards InSAR understanding." Proceedings of the IEEE/CVF CVPR. 2022.

[5] Liu et al. "Self-supervised learning is more robust to dataset imbalance." arXiv preprint arXiv:2110.05025 (2021).

[6] Lazecký et al. "LiCSAR: An automatic InSAR tool for measuring and monitoring tectonic and volcanic activity." Remote Sensing 12.15 (2020): 2430.

How to cite: Bountos, N. I., Michail, D., Herekakis, T., Thanasou, A., and Papoutsis, I.: Pluto: A global volcanic activity early warning system powered by large scale self-supervised deep learning on InSAR data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5913, https://doi.org/10.5194/egusphere-egu23-5913, 2023.

It has become increasingly apparent over the past few decades that environmental degradation is a common concern for humanity, and it is difficult to deny that present environmental problems are caused primarily by anthropogenic activities rather than natural causes.

To minimize disaster risk, geospatial science and technology are helpful and necessary tools for hazard zone mapping during emergency conditions. 

This approach can help predict harmful events, but also mitigate environmental damage from events that cannot be efficiently predicted.

With detailed information obtained from various datasets, decision making becomes simpler, which is crucial for a quick and effective response to any disaster. Remote sensing, in particular radar/SAR data, helps in managing a disaster at various stages. 

Prevention, for example, refers to the outright avoidance of adverse impacts of hazards and related disasters; preparedness refers to the knowledge and capacities needed to effectively anticipate, respond to, and recover from the impacts of likely, imminent or current hazard events or conditions.

Finally, relief is the provision of emergency services after a disaster in order to reduce damage to the environment and people.

Thanks to the opportunity proposed by ASI (Italian Space Agency) to use COSMO-SkyMed data, at NeMeA Sistemi srl we developed three projects: “Ventimiglia Legalità”, “Edilizia Spontanea”, and 3xA.

Their main objective is to detect illegal buildings not present in the legal land registry.

We developed new and innovative technologies using integrated data for the monitoring and protection of environmental and anthropogenic health, in coastal and nearby areas. 

The 3xA project addresses the highly challenging problem of automatically detecting changes from a time series of high-resolution synthetic aperture radar (SAR) images. In this context, to fully leverage the potential of such data, an innovative machine learning based approach has been developed. 

The project is characterized by an end-to-end training and inference system which takes as input two raw images and produces a vectorized change map without any human supervision.

In more detail, it takes as input two SAR acquisitions at times t1 and t2; the acquisitions are first pre-processed and homogenised, and finally undergo a completely self-supervised algorithm that takes advantage of DNNs to classify changed/unchanged areas. This method shows promising results in automatically producing a change map from two input SAR images (Stripmap or Spotlight COSMO-SkyMed data), with 98% accuracy.

Being the process automated, results are produced faster than similar products generated by human operators.

A similar approach has been followed to create an algorithm which performs semantic segmentation from the same kind of data.

This time, only one of the two SAR acquisitions is taken as input for pre-processing steps and then for a supervised neural network. The result is a single image where each pixel is labelled with the class predicted by the algorithm. 

Also in this case, results are promising, reaching around 90% accuracy. 

How to cite: Pennino, I.: A new approach for hazard and disaster prevention: deep learning algorithms for change detection and classification RADAR/SAR, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6522, https://doi.org/10.5194/egusphere-egu23-6522, 2023.

EGU23-6790 | ECS | Posters on site | ITS1.1/NH0.1

Deep learning for automatic flood mapping from high resolution SAR images 

Arnaud Dupeyrat, abdullah Almaksour, Joao Vinholi, and tapio friberg

With the gradual warming of the global climate, natural catastrophes have caused billions of dollars in damage to ecosystems, economies and properties. Along with the damage, the loss of life is a very serious possibility. With the unprecedented growth of the human population, large-scale development activities and changes to the natural environment, the frequency and intensity of extreme natural events and consequent impacts are expected to increase in the future. 

To be able to mitigate and reduce the potential damage of natural catastrophes, continuous monitoring is required. The collection of data using earth observation (EO) systems has been valuable for tracking the effects of natural hazards, especially with their near real-time capabilities for tracking extreme natural events. Remote sensing systems from different platforms also serve as an important decision support tool for devising response strategies, coordinating rescue operations, and making damage and loss estimations.

 Synthetic aperture radar (SAR) imagery provides highly valuable information about our planet that no other technology is capable of. SAR sensors emit their own energy to illuminate objects or areas on Earth and record what’s reflected back from the surface to the sensor. This allows data acquisition day and night since no sunlight is needed. SAR also uses longer wavelengths than optical systems, which gives it the unsurpassed advantage of being able to penetrate clouds, rain, fog and smoke. All of this makes SAR imagery unprecedentedly valuable in sudden events and crisis situations requiring a rapid response.

In this talk we will focus on flood monitoring using our ICEYE SAR images, taking into account the multiple satellites, angles, and resolutions inherent to our constellation and capabilities. We will present the different steps that have allowed us to improve the consistency of our generated flood maps.

How to cite: Dupeyrat, A., Almaksour, A., Vinholi, J., and friberg, T.: Deep learning for automatic flood mapping from high resolution SAR images, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6790, https://doi.org/10.5194/egusphere-egu23-6790, 2023.

Increasing climatic extremes have raised the frequency and severity of urban flood events during the last several decades. Significant economic losses point out the urgency of flood response. In recent years, the government has gradually expanded the deployment of CCTV water level monitoring facilities to support decision-making during flood events. However, it is difficult for decision makers to interpret multiple images at the same time. Therefore, this study attempts to establish an automatic water level recognition method for a given closed-circuit television (CCTV) system.

In recent years, many advances have been made in automatic image recognition with artificial intelligence methods, yet little literature has been published on real-time water level recognition from CCTV systems for disaster management. The purpose of this study is to examine the practical possibilities of artificial intelligence for real-time water level recognition with deep convolutional neural networks. The proposed methodology will be demonstrated with several case studies in Taichung. To address the potential issue that AI models may lack training targets, a generative adversarial network (GAN) may be adopted. The results of this study could be useful to decision makers responsible for organizing response assignments during flood events.

How to cite: Chen, B. and Li, C.-Y.: A study on the establishment of computer vision for disaster identification based on existing closed-circuit television system, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7435, https://doi.org/10.5194/egusphere-egu23-7435, 2023.

EGU23-8419 | ECS | Orals | ITS1.1/NH0.1

Synthetic Generation of Extra-Tropical Cyclones’ fields with Generative Adversarial Networks 

Filippo Dainelli, Riccardo Taormina, Guido Ascenso, Enrico Scoccimarro, Matteo Giuliani, and Andrea Castelletti

Extra-Tropical Cyclones (ETCs) are major systems ruling and influencing the atmospheric structure at mid-latitudes. They are characterised by strong winds and heavy precipitation, and can cause considerable storm surges potentially devastating for coastal regions. The availability of historical observations of the extreme events caused by intense ETCs is rather limited, hampering risk evaluation. Increasing the amount of significant data available would substantially help several fields of analysis influenced by these events, such as coastal management, agricultural production, energy distribution, air and maritime transportation, and risk assessment and management.

Here, we address the possibility of generating synthetic ETC atmospheric fields of mean sea level pressure, wind speed, and precipitation in the North Atlantic by training a Generative Adversarial Network (GAN). The purpose of GANs is to learn the distribution of a training set based on a game theoretic scenario where two networks compete against each other, the generator and the discriminator. The former is trained to generate synthetic examples that are plausible and resemble the real ones. The input of the generator is a vector of random Gaussian values, whose domain is known as the “latent space”. The discriminator learns to distinguish whether an example comes from the dataset distribution. The competition set by the game-theoretic approach improves the network until the counterfeits are indistinguishable from the originals.

To train the GAN, we use atmospheric fields extracted from the ERA5 reanalysis dataset in the geographic domain with boundaries 0°-90°N, 70°W-20°E, for the period 1 January 1979 to 1 January 2020. We analyse the generated samples' histograms, the samples' average fields, and the Wasserstein distance and Kullback-Leibler divergence between the generated samples and the test set distributions. Results show that the generative model has learned the distribution of the values of the atmospheric fields and the general spatial trends of the atmosphere in the domain. To better evaluate the atmospheric structure learned by the network, we perform linear and spherical interpolations in the latent space. Specifically, we consider four cyclones and compare the frames of their tracks to those of the synthetic tracks generated by interpolation. The interpolated tracks show interesting features consistent with the original tracks. These findings suggest that GANs can learn meaningful representations of the ETCs' fields, encouraging further investigations to model the tracks' temporal evolution.
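
The spherical interpolation (slerp) used above can be sketched as follows; the latent dimension is an illustrative assumption, and each interpolated vector would be passed through the trained generator to obtain a synthetic field:

```python
# Spherical interpolation between two GAN latent vectors.
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between latent vectors z0, z1, with t in [0, 1]."""
    omega = np.arccos(np.clip(
        np.dot(z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1 - t) * z0 + t * z1  # vectors nearly parallel
    return (np.sin((1 - t) * omega) * z0
            + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(0)
z_start, z_end = rng.normal(size=128), rng.normal(size=128)
frames = [slerp(z_start, z_end, t) for t in np.linspace(0, 1, 10)]
# each frame would then be decoded by the trained generator: G(frame)
```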

How to cite: Dainelli, F., Taormina, R., Ascenso, G., Scoccimarro, E., Giuliani, M., and Castelletti, A.: Synthetic Generation of Extra-Tropical Cyclones’ fields with Generative Adversarial Networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8419, https://doi.org/10.5194/egusphere-egu23-8419, 2023.

EGU23-8944 | ECS | Orals | ITS1.1/NH0.1

Towards probabilistic impact-based drought risk analysis – a case study on the Volta Basin 

Marthe Wens, Raed Hamed, Hans de Moel, Marco Massabo, and Anna Mapelli

Understanding the relationships between different drought drivers and observed drought impacts can provide important information for early warning systems and drought management planning. Moreover, this relationship can help inform the definition and delineation of drought events. However, drought hazards are currently often characterized by their frequency of occurrence rather than by the impacts they cause. A more data-driven depiction of "impactful drought events", whereby droughts are defined by the hydrometeorological conditions that, in the past, have led to observable impacts, has the potential to be more meaningful for drought risk assessments.

In our research, we apply a data-mining method based on association rules, namely fast and frugal decision trees, to link different drought hazard indices to agricultural impacts. This machine learning technique is able to select the most relevant drought hazard drivers (among both hydrological and meteorological indices) and their thresholds associated with “impactful drought events”. The technique can be used to assess the likelihood of occurrence of several impact severities, hence it supports the creation of a loss exceedance curve and estimates of average annual loss. An additional advantage is that such data-driven relations in essence reflect varying local drought vulnerabilities which are difficult to quantify in data-scarce regions.

This contribution exemplifies the use of fast and frugal decision trees to estimate (agricultural) drought risk in the Volta basin and its riparian countries. We find that some agriculture-dependent regions in Ghana, Togo and Côte d’Ivoire face annual average drought-induced maize production losses of up to 3M USD, while losses can reach on average 50 USD/ha per year in Burkina Faso. In general, there is a clear north-south gradient in drought risk, which we find amplified under projected climate conditions. Climate change is estimated to worsen drought impacts in the Volta Basin, with 11 regions facing increases in annual average losses of more than 50%.

We show that the proposed multi-variate, impact-based, non-parametric machine learning approach can improve the evaluation of droughts, as it directly leverages observed drought impact information to demarcate impactful drought events. We demonstrate that the proposed technique can support quantitative drought risk assessments, which can be used for geographic comparison of disaster losses at a sub-national scale.
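
A fast-and-frugal tree of the kind mentioned above can be pictured with the toy sketch below; the cues, thresholds, and impact classes are invented for illustration, not the study's fitted values:

```python
# Fast-and-frugal tree: each node checks one cue and can exit immediately.
def impactful_drought(spi3, soil_moisture_pct, streamflow_pct):
    """Classify drought impact from three illustrative hazard indices."""
    if spi3 > -1.0:              # meteorological drought index not severe
        return "no impact"
    if soil_moisture_pct > 20:   # root-zone soil moisture percentile
        return "minor losses"
    if streamflow_pct > 10:      # hydrological drought percentile
        return "moderate losses"
    return "severe losses"

print(impactful_drought(spi3=-1.6, soil_moisture_pct=12, streamflow_pct=5))
```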

How to cite: Wens, M., Hamed, R., de Moel, H., Massabo, M., and Mapelli, A.: Towards probabilistic impact-based drought risk analysis – a case study on the Volta Basin, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8944, https://doi.org/10.5194/egusphere-egu23-8944, 2023.

EGU23-9091 | Orals | ITS1.1/NH0.1

Improving near real-time flood extraction pipeline from SAR data using deep learning 

Mathieu Turgeon-Pelchat, Heather McGrath, Fatemeh Esfahani, Simon Tolszczuk-Leclerc, Thomas Rainville, Nicolas Svacina, Lingjun Zhou, Zarrin Langari, and Hospice Houngbo

The Canada Centre for Mapping and Earth Observation (CCMEO) uses Radarsat Constellation Mission (RCM) data for near real-time flood mapping. One of the many advantages of SAR sensors is that they are less affected by cloud coverage and atmospheric conditions than optical sensors. RCM has been used operationally since 2020 and employs three satellites, enabling lower revisit times and increased imagery coverage. The team responsible for producing flood maps in the context of emergency response is able to produce maps within four hours of data acquisition. Although the results from their automated system are good, there are some limitations, mainly in urban and vegetated areas, requiring manual intervention to correct the data before publication. Work started in 2021 to use deep learning algorithms, namely convolutional neural networks (CNNs), to improve the automated production of flood inundation maps. The training dataset makes use of former maps created by the emergency response team and comprises over 80 SAR images and corresponding digital elevation models (DEMs) at multiple locations in Canada. The training and test images were split into smaller tiles of 256 x 256 pixels, for a total of 22,469 training tiles and 6,821 test tiles. The current implementation uses a U-Net architecture from the NRCan geo-deep-learning pipeline (https://github.com/NRCan/geo-deep-learning). To measure model performance, the intersection over union (IoU) metric is used. The model achieves 83% IoU for extracting water and flood from background areas over the test tiles. Next steps include increasing the number of geographical contexts in the training set, towards integrating the model into production.
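
For reference, the IoU metric quoted above reduces to a few lines for a binary water mask; the arrays below are toy stand-ins for model output and ground truth:

```python
# Intersection over union for binary segmentation masks.
import numpy as np

def iou(pred, target):
    """pred, target: boolean arrays of the same shape."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return intersection / union if union else 1.0

pred = np.zeros((256, 256), dtype=bool); pred[50:150, 50:150] = True
target = np.zeros((256, 256), dtype=bool); target[60:160, 60:160] = True
print(f"IoU = {iou(pred, target):.3f}")
```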

How to cite: Turgeon-Pelchat, M., McGrath, H., Esfahani, F., Tolszczuk-Leclerc, S., Rainville, T., Svacina, N., Zhou, L., Langari, Z., and Houngbo, H.: Improving near real-time flood extraction pipeline from SAR data using deep learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9091, https://doi.org/10.5194/egusphere-egu23-9091, 2023.

EGU23-9426 | ECS | Orals | ITS1.1/NH0.1

Fire hazard modelling with remote sensing data for South America 

Johanna Strebl, Julia Gottfriedsen, Dominik Laux, Max Helleis, and Volker Tresp

Throughout the past couple of years, changes in global climate have been turning wildfires into an increasingly unpredictable phenomenon. Many environmental parameters that have been linked to wildfires, such as the number of consecutive hot days, are becoming increasingly unstable. This leads to a twofold problem: adequate fire risk assessment is at once more important and more difficult than ever. 

In the past, physical models were the prevalent approach to most questions in the domain of wildfire science. While they tend to provide accurate and transparent results, they require domain expertise and often tedious manual data collection.

In recent years, increased computational capabilities and the improved availability of remote sensing data associated with the new space movement have made deep learning a beneficial approach. Data-driven approaches often yield state-of-the-art performance without requiring expert knowledge, at a fraction of the complexity of physical models. The downside, however, is that they are often opaque and offer no insights into their inner algorithmic workings. 

We want to shed some light on this interpretability/performance tradeoff and compare different approaches for predicting wildfire hazard. We evaluate their strengths and weaknesses with a special focus on explainability. We built a wildfire hazard model for South America based on a spatiotemporal CNN architecture that infers fire susceptibility from environmental conditions that led to fire in the past. The training data used contains selected ECMWF ERA5 Land variables and ESA world cover information. This means that our model is able to learn from actual fire conditions instead of relying on theoretical frameworks. Unlike many other models, we do not make simplifying assumptions such as a standard fuel type, but calculate hazard ratings based on actual environmental conditions. Compared to classical fire hazard models, this approach allows us to account for regional and atypical fire behavior and makes our model readily adaptable and trainable for other ecosystems, too.

The ground truth labels are derived from fusing active fire remote sensing data from 20 different satellites into one active wildfire cluster data set. The problem itself is highly imbalanced with non-fire pixels making up 99.78% of the training data. Therefore we evaluate the ability of our model to correctly predict wildfire hazard using metrics for imbalanced data such as PR-AUC and F1 score. We also compare the results against selected standard fire hazard models such as the Canadian Fire Weather Index (FWI). 

In addition, we assess the computational complexity and speed of calculating the respective models and consider the accuracy/complexity/speed tradeoff of the different approaches. Furthermore, we aim to provide insights why and how our model makes its predictions by leveraging common explainability methods. This allows for insights into which factors tend to influence wildfire hazard the most and to optimize for relatively lightweight, yet performant and transparent architectures.
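
The imbalance-aware evaluation mentioned above can be sketched in a few lines; the class ratio mirrors the one quoted in the abstract, while the scores are synthetic stand-ins:

```python
# PR-AUC and F1 on a heavily imbalanced toy problem (~0.22% positives).
import numpy as np
from sklearn.metrics import average_precision_score, f1_score

rng = np.random.default_rng(0)
n = 100_000
y_true = (rng.random(n) < 0.0022).astype(int)   # rare fire pixels
# toy hazard scores: fire pixels score higher on average
y_score = np.clip(rng.normal(0.2 + 0.5 * y_true, 0.2), 0, 1)

print("PR-AUC:", average_precision_score(y_true, y_score))
print("F1 @ 0.5:", f1_score(y_true, (y_score > 0.5).astype(int)))
```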

How to cite: Strebl, J., Gottfriedsen, J., Laux, D., Helleis, M., and Tresp, V.: Fire hazard modelling with remote sensing data for South America, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9426, https://doi.org/10.5194/egusphere-egu23-9426, 2023.

In recent years, Machine Learning (ML) models have proven useful for solving problems in a wide variety of fields, such as medicine, economics, manufacturing, transportation, energy, and education. With increased interest in ML models and advances in sensor technologies, ML models are being widely applied in the civil engineering domain as well. ML models enable the analysis of large amounts of data, automation, and improved decision making, and provide more accurate predictions. While several state-of-the-art reviews have been conducted in individual sub-domains of civil engineering (e.g., geotechnical engineering, structural engineering) or on specific application problems (e.g., structural damage detection, water quality evaluation), little effort has been devoted to a comprehensive review of ML models applied across civil engineering and to comparing them across sub-domains. A systematic but domain-specific literature review framework is needed to effectively classify and compare the models. To that end, this study proposes a novel review approach based on the hierarchical classification tree “D-A-M-I-E (Domain-Application problem-ML models-Input data-Example case)”. The “D-A-M-I-E” classification tree classifies ML studies in civil engineering based on (1) the civil engineering domain, (2) the application problem, (3) the applied ML models, and (4) the data used in the problem. Moreover, the data used for the ML models in each application example are examined based on the specific characteristics of the domain and the application problem. For a comprehensive review, five domains (structural engineering, geotechnical engineering, water engineering, transportation engineering, and energy engineering) are considered, and the ML application problem is divided into five types (prediction, classification, detection, generation, optimization). Based on the “D-A-M-I-E” classification tree, about 300 ML studies in civil engineering are reviewed. For each domain, the following questions are analyzed and compared: (1) which problems are mainly solved with ML models, (2) which ML models are mainly applied in each domain and problem, (3) how advanced the ML models are, and (4) what kind of data are used and what data processing is performed for the application of ML models. This paper also assesses the extensibility and applicability of the proposed methodology to other areas (e.g., Earth system modeling, climate science). Furthermore, based on the identification of research gaps for ML models in each domain, this paper provides future directions for ML in civil engineering based on approaches to dealing with data (e.g., collection, handling, storage, and transmission) and hopes to facilitate the application of ML models in other fields.

How to cite: Kim, J. and Jung, D.: State-of-the-Art Review of Machine Learning Models in Civil Engineering: Based on DAMIE Classification Tree, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11636, https://doi.org/10.5194/egusphere-egu23-11636, 2023.

EGU23-11756 * | Orals | ITS1.1/NH0.1 | Highlight

Digital twin computing for enhancing resilience of disaster response system 

Shunichi Koshimura and Erick Mas

Digital twins are now recognized as digital copies of physical-world objects stored in digital space and utilized to simulate the sequences and consequences of target phenomena. By incorporating the physical world's data into the digital twin, developers and users have a full view of the target through real-time feedback. Recent advances in high-performance computing and large-scale data fusion of sensing and observations of both natural and social phenomena are enhancing the applicability of the digital twin paradigm to natural disaster research. Artificial intelligence (AI) and machine learning are also being applied ever more widely across the world and are contributing as essential elements of digital twins. These advances have significant implications for disaster response and recovery, holding out the promise of dramatically improving our understanding of disaster-affected areas and responses in real time.

A project is underway to enhance the resilience of disaster response systems by constructing a "Disaster Digital Twin" to support disaster response teams in an anticipated tsunami disaster. The "Disaster Digital Twin" platform consists of a fusion of real-time hazard simulation (e.g., tsunami inundation forecasting), social sensing to identify the dynamically exposed population, and multi-agent simulation of disaster response activities to find the optimal allocation or strategy of response efforts, thereby enhancing disaster resilience.

To achieve the goal of innovating digital twin computing for enhancing disaster resilience, four preliminary results are shown:

(1) Developing a nation-wide real-time tsunami inundation and damage forecast system. The priority target for forecasting is the Pacific coast of Japan, a region where a Nankai Trough earthquake is likely to occur.

(2) Establishing a real-time estimation of the exposed population in the inundation zone and clarifying the relationship between the exposed population and medical demand.

(3) Developing a reinforcement learning-based multi-agent simulation of medical activities in the affected areas, using damage information, medical demands, and the resources of medical facilities to find the optimal allocation of medical response.

(4) Developing a digital twin computing platform to support disaster medical response activities and to find the optimal allocation of disaster medical services through what-if analysis of multi-agent simulation.

How to cite: Koshimura, S. and Mas, E.: Digital twin computing for enhancing resilience of disaster response system, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11756, https://doi.org/10.5194/egusphere-egu23-11756, 2023.

EGU23-12240 | ECS | Posters on site | ITS1.1/NH0.1

Classification Seismic Spectrograms from Deep Neural Network: Application to Alarm System of Post-failure Landslides 

Jui-Ming Chang, Wei-An Chao, and Wei-Kai Huang

The Daman Landslide blocked one of the three cross-island roads in Taiwan, and the affected road section has been under traffic control since last October. During this period, thousands of small-scale post-failure events occurred, whose irregular patterns threatened the safety of engineering workers carrying out slope protection construction as well as road users. We therefore installed one time-lapse camera and two geophones, at the crown and close to the toe of the Daman landslide respectively, to train a classification model that provides in-situ alarms. According to the time-lapse photos, the post-failures can be categorized into two types: rock/debris moving and stopping on the upper slope or road (type I), and rock/debris crossing the road to the downslope side (type II). Type I was recorded almost exclusively by the crown station, whereas type II appeared at both stations with different arrival times, with the toe station's high-frequency signals gradually rising (up to 100 Hz). These distinct features are clearly exhibited in spectrograms. To retain the characteristics of both stations simultaneously, we merge the two stations' spectrograms into a single image that indicates the different types of post-failures. However, frequent earthquakes affect the discrimination of landslides and therefore must be included in the classification model. A dataset with three labels (type I, type II, and earthquake), containing more than 15,000 spectrogram images, was used to train a deep neural network (DNN) as a two-station-based automatic classifier. In addition, user-defined parameters for specific frequency bands within fixed time-span windows, including the sum of power spectral density, the arrival time of peak amplitude, the cross-correlation coefficient, and the signal-to-noise ratio, were utilized in a decision tree algorithm. Both model results benefit the automatic classifier for post-failure alarms and can readily be extended, via transfer learning, to monitoring other landslides with frequent post-failures.
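
As a minimal sketch of the two-station spectrogram merging step, the snippet below computes a spectrogram for each station with SciPy and stacks the two along the frequency axis into a single classifier input; the sampling rate, window length, and synthetic waveforms are placeholder assumptions.

import numpy as np
from scipy.signal import spectrogram

fs = 200.0  # assumed geophone sampling rate (Hz)
t = np.arange(0, int(60 * fs)) / fs
crown = np.random.randn(t.size)  # placeholder for crown-station waveform
toe = np.random.randn(t.size)    # placeholder for toe-station waveform

# One spectrogram per station (frequencies x time bins).
f, tt, S_crown = spectrogram(crown, fs=fs, nperseg=256, noverlap=128)
_, _, S_toe = spectrogram(toe, fs=fs, nperseg=256, noverlap=128)

# Merge the two stations into a single image so that one input preserves
# both stations' time-frequency characteristics (e.g., arrival-time offsets).
merged = np.vstack([10 * np.log10(S_crown + 1e-12),
                    10 * np.log10(S_toe + 1e-12)])
print(merged.shape)  # (2 * n_frequencies, n_time_bins)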

How to cite: Chang, J.-M., Chao, W.-A., and Huang, W.-K.: Classification Seismic Spectrograms from Deep Neural Network: Application to Alarm System of Post-failure Landslides, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12240, https://doi.org/10.5194/egusphere-egu23-12240, 2023.

EGU23-12716 | ECS | Posters on site | ITS1.1/NH0.1

Investigating causal effects of anthropogenic factors on global fire modeling 

Nirlipta Pande and Wouter Dorigo

Humans significantly control the natural environment and natural processes. Global fire ignitions are a prime example of how human actions change the frequency of occurrence of otherwise rare events like wildfires. However, human controls on fire ignition are insufficiently characterised by global fire models because impacts are often indirect, complex, and collinear. Hence, modelling fire activity while considering the complex relationships amongst the input variables and their effect on global ignitions is crucial to developing fire models reflecting the real world. 

This presentation leverages causal inference and machine learning frameworks applied to global datasets of fire ignitions from Earth observations and potential drivers to uncover anthropogenic pathways of fire ignition. Potential fire controls include human predictors from Earth observations and statistical data, combined with variables traditionally associated with fire activity, such as weather and vegetation abundance and state, derived from Earth observations and models.

Our research models causal relationships between fire control variables and global ignitions using Directed Acyclic Graphs (DAGs). Here, every edge symbolises a relation between two variables: the edge weight indicates the strength of the relationship, and its orientation signifies the direction of cause and effect. However, defining a fire ignition distribution using DAGs is challenging owing to the large combinatorial sample space and the acyclicity constraint. We use Bayesian structure learning to make these approximations and to infer the extent of human intervention when combined with climate variables and vegetation properties. Our research demonstrates the need for causal modelling and the inclusion of anthropogenic factors in global fire modelling.
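
A minimal sketch of the DAG representation described above is given below using networkx; the variables, edge weights, and orientations are illustrative assumptions, not learned results.

import networkx as nx

# Each directed edge encodes a hypothesised cause-effect relation; the
# 'weight' attribute stands in for the strength of the relationship.
G = nx.DiGraph()
G.add_edge("population density", "ignitions", weight=0.6)
G.add_edge("road density", "ignitions", weight=0.4)
G.add_edge("fuel moisture", "ignitions", weight=-0.7)
G.add_edge("precipitation", "fuel moisture", weight=0.8)

# Any learned structure must respect the acyclicity constraint noted above.
assert nx.is_directed_acyclic_graph(G)

# Direct causes of ignitions and their assumed strengths.
for parent in G.predecessors("ignitions"):
    print(parent, G[parent]["ignitions"]["weight"])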

How to cite: Pande, N. and Dorigo, W.: Investigating causal effects of anthropogenic factors on global fire modeling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12716, https://doi.org/10.5194/egusphere-egu23-12716, 2023.

EGU23-13083 | Orals | ITS1.1/NH0.1

Machine learning modelling of compound flood events 

Agnieszka Indiana Olbert, Sogol Moradian, and Galal Uddin

Flood early warning systems are vital for preventing flood damage and for reducing disaster risks. Such systems are particularly important for forecasting compound events, where multiple, often dependent flood drivers co-occur and interact. In this research, an early warning system for the prediction of coastal-fluvial floods is developed to provide a robust, cost-effective and time-efficient framework for the management of flood risks and impacts. This three-step method combines a cascade of three linked models: (1) a statistical model that determines the probabilities of multiple-driver flood events, (2) a hydrodynamic model forced by outputs from the statistical model, and (3) a machine learning (ML) model that uses hydrodynamic outputs from various probability flood events to train the ML algorithm to predict the spatially and temporally variable inundation patterns resulting from a combination of coastal and fluvial flood drivers occurring simultaneously.

The method has been applied to the case of Cork City, located in the south-west of Ireland, which has a long history of fluvial-coastal flooding. The Lee River, channelling through the city centre, may generate a substantial flood when the downstream river flow draining to the estuary coincides with sea water propagating upstream on a flood tide. For this hydrological domain, the statistical model employs univariate extreme value analysis and copula functions to calculate joint probabilities of river discharges and sea water levels (astronomical tides and surge residuals) occurring simultaneously. The return levels of these two components along a return level curve produced by the copula function are used to generate synthetic time series, which serve as water level boundary conditions for a hydrodynamic flood model. The multi-scale nested flood model (MSN_Flood) was configured for Cork City at 2 m resolution to simulate unsteady, non-uniform flow in the Lee River and flood wave propagation over urban floodplains. The ensemble hydrodynamic model outputs are ultimately used to train and test a range of machine learning models for the prediction of flood extents and water depths. In total, 23 machine learning algorithms, including Artificial Neural Network, Decision Tree, Gaussian Process Regression, Linear Regression, Radial Basis Function, Support Vector Machine, and Support Vector Regression, were employed to confirm that ML algorithms can successfully predict flood inundation depths over urban floodplains for a given set of compound flood drivers. Here, the flood conditioning factors taken into account are limited to the upstream flood hydrographs and the downstream sea water level time series. To evaluate model performance, different statistical skill scores were computed. Results indicate that, in most pixels, the Gaussian Process Regression model performs better than the other models.

The main contribution of this research is to demonstrate that ML models can be used in early warning systems for flood prediction and to give insight into the most suitable models in terms of robustness, accuracy, effectiveness, and speed. The findings demonstrate that ML models do help in flood water propagation mapping and in the assessment of flood risk under various compound flood scenarios.
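
As an illustration of the final step, the sketch below trains a Gaussian Process Regression surrogate on hydrodynamic ensemble outputs, with boundary-condition drivers as inputs and inundation depth at one pixel as the target; the feature definitions and synthetic data are assumptions, not the study's actual setup.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

# Synthetic ensemble: each row is one simulated compound event,
# [peak river discharge (m3/s), peak sea water level (m)].
X = rng.uniform([50.0, 0.5], [600.0, 3.5], size=(200, 2))
# Placeholder hydrodynamic output: flood depth (m) at a single pixel.
y = 0.002 * X[:, 0] + 0.8 * X[:, 1] + 0.1 * rng.standard_normal(200)

kernel = ConstantKernel(1.0) * RBF(length_scale=[100.0, 1.0])
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# Predict depth (with uncertainty) for a new compound driver combination.
mean, std = gpr.predict([[400.0, 2.5]], return_std=True)
print(mean[0], std[0])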

How to cite: Olbert, A. I., Moradian, S., and Uddin, G.: Machine learning modelling of compound flood events, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13083, https://doi.org/10.5194/egusphere-egu23-13083, 2023.

EGU23-14126 | ECS | Orals | ITS1.1/NH0.1

ML-based fire spread model and data pipeline optimization 

Tobias Bauer, Julia Miller, Julia Gottfriedsen, Christian Mollière, Juan Durillo Barrionuevo, and Nicolay Hammer

Climate change is one of the most pressing challenges to humankind today. The number and severity of wildfires are increasing in many parts of the world, with record-breaking temperatures, prolonged heat waves, and droughts. We can minimize the risks and consequences of these natural disasters by providing accurate and timely wildfire progression predictions through fire spread modeling. Knowing the direction and rate of spread of wildfires over the next hours can help deploy firefighting resources more efficiently and warn nearby populations hours in advance to allow safe evacuation.
Physics-based spread models have proven their applicability at the regional scale but often require detailed spatial input data. Additionally, running them in real-time scenarios can be slow, which inhibits fast output generation. Deep learning-based models have shown success in specific fire spread scenarios in recent years, but they are limited by their transferability to other regions, their explainability, and their longer training times. Accurate active fire data products and a fast data pipeline are additional essential requirements of a wildfire spread early-warning system.
In this study, physics-based models are compared to a deep learning-based CNN approach in terms of computational speed, area accuracy, and spread direction. We use a dataset of the 30 largest wildfires in the US in 2021 to evaluate the performance of the models' predictions.
This work focuses in particular on the optimization of a cloud-based fire spread modeling data pipeline for near-real-time fire progression over the next 2 to 24 hours. We describe our data pipeline, including the collection and pre-processing of ignition points derived from remote sensing-based active fire detections. Furthermore, we use data from SRTM-1 for topography, ESA Land Cover and Corine Land Cover for fuel composition, and ERA-5 reanalysis products for weather data inputs. The physics-based models are implemented with the open-source library ForeFire, which allows creating and executing physical wildfire spread models from single fire ignition points as well as fire fronts. The predictions of the ForeFire model serve as a benchmark for evaluating the performance of our Convolutional Neural Network (CNN). The CNN forecasts the fire outline based on a spatiotemporal U-Net architecture.
The scaling of the algorithms to a global setting is enabled by the Leibniz Supercomputing Centre, which makes large-scale cloud-based machine learning possible and provides a time-sensitive solution for operational fire spread modeling in emergency management based on real-time remote sensing information.
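
A minimal sketch of a U-Net-style network for fire-front prediction is shown below, with past fire masks and static drivers stacked as input channels; the channel counts and depth are illustrative assumptions and far smaller than an operational model.

import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    """Two-level U-Net: input channels stack past fire masks with
    topography, fuel, and weather rasters; output is the next fire mask."""
    def __init__(self, c_in=8):
        super().__init__()
        self.enc1 = block(c_in, 16)
        self.enc2 = block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = block(32, 16)
        self.head = nn.Conv2d(16, 1, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))
        return torch.sigmoid(self.head(d))  # burn probability per pixel

net = TinyUNet()
x = torch.randn(1, 8, 64, 64)  # e.g., 4 past fire masks + 4 driver rasters
print(net(x).shape)            # torch.Size([1, 1, 64, 64])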

How to cite: Bauer, T., Miller, J., Gottfriedsen, J., Mollière, C., Durillo Barrionuevo, J., and Hammer, N.: ML-based fire spread model and data pipeline optimization, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14126, https://doi.org/10.5194/egusphere-egu23-14126, 2023.

EGU23-15711 | Orals | ITS1.1/NH0.1

A globally distributed dataset using generalized DL for rapid landslide mapping on HR satellite imagery 

Filippo Catani, Sansar Raj Meena, Lorenzo Nava, Kushanav Bhuyan, Silvia Puliero, Lucas Pedrosa Soares, Helen Cristina Dias, and Mario Floris

Multiple landslide events occur frequently across the world and have the potential to cause significant harm to both human life and property. Although a substantial amount of research has been conducted on mapping landslides using Earth Observation (EO) data, several gaps and uncertainties remain when developing models intended to be operational at the global scale. To address this issue, we present HR-GLDD, a high-resolution (HR) dataset for landslide mapping composed of landslide instances from ten different physiographical regions globally, in South and South-East Asia, East Asia, South America, and Central America. The dataset contains five rainfall-triggered and five earthquake-triggered multiple landslide events that occurred in varying geomorphological and topographical regions. HR-GLDD is one of the first landslide detection datasets generated from high-resolution satellite imagery and can be useful for artificial intelligence applications in landslide segmentation and detection studies. Five state-of-the-art deep learning models were used to test the transferability and robustness of HR-GLDD. Moreover, two recent landslide events were used to test the performance and usability of the dataset for the detection of newly occurring significant landslide events. The deep learning models showed similar results when testing HR-GLDD at individual test sites, indicating the robustness of the dataset for such purposes. HR-GLDD is openly accessible and has the potential to support the calibration and development of models that produce reliable inventories from high-resolution satellite imagery after the occurrence of new significant landslide events. HR-GLDD will be updated regularly by integrating data from new landslide events.

How to cite: Catani, F., Meena, S. R., Nava, L., Bhuyan, K., Puliero, S., Pedrosa Soares, L., Dias, H. C., and Floris, M.: A globally distributed dataset using generalized DL for rapid landslide mapping on HR satellite imagery, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15711, https://doi.org/10.5194/egusphere-egu23-15711, 2023.

EGU23-16626 | ECS | Posters on site | ITS1.1/NH0.1

Danish national early warning system for flash floods based on a gradient boosting machine learning framework 

Grith Martinsen, Yann Sweeney, Jonas Wied Pedersen, Roxana Alexandru, Sergi Capape, Charlotte Harris, Michael Butts, and Maria Diaz

Fluvial and flash floods can have devastating effects if they occur without warning. In Denmark, the management of flood risk and the execution of preventative emergency service actions have been the sole responsibility of local municipalities. However, motivated by the disastrous 2021 floods in Central Europe, the Danish government has recently appointed the Danish Meteorological Institute (DMI) as the national authority for flood warnings in Denmark, and DMI is in the process of building capacity to fulfill this role.

One of the most cost-effective ways to mitigate flood damages is a well-functioning early warning system. Flood warning systems can rely on various methods ranging from human interpretation of meteorological and hydrological data to advanced hydrological modelling. The aim of this study is to generate short-range streamflow predictions in Danish river systems with lead times of 4-12 hours. To do so, we train and test models with hourly data on 172 catchments.

Machine learning (ML) models have in many cases been shown to outperform traditional hydrological models and offer efficient ways to learn patterns in historical data. Here, we investigate streamflow predictions with LightGBM, which is a gradient boosting framework that employs tree-based ML algorithms and is developed and maintained by Microsoft (Ke et al., 2017). The main argument for choosing a tree-based algorithm is its inherent ability to represent rapid dynamics often observed during flash floods. The main advantages of LightGBM over other tree-based algorithms are efficiency in training and lower memory consumption. We benchmark LightGBM’s performance against persistence, linear regression and various LSTM setups from the Neural Hydrology library (Kratzert et al., 2022).

We evaluate the algorithm trained with different input features. This analysis includes model explainability techniques, such as SHAP, and the results indicate that simply using lagged real-time observations of streamflow together with precipitation leads to the best-performing and most parsimonious models. The results show that the LightGBM setup outperforms the benchmarks and is able to generate predictions with high Kling-Gupta Efficiency scores (> 0.9) in most catchments. Compared to the persistence benchmark, it shows especially strong improvements in peak timing errors.
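
A minimal sketch of this feature setup, assuming hypothetical hourly series, is given below: lagged real-time streamflow observations and precipitation feed a LightGBM regressor predicting streamflow several hours ahead (the lags, lead time, and hyperparameters are illustrative, not the operational configuration).

import numpy as np
import pandas as pd
import lightgbm as lgb

rng = np.random.default_rng(1)
n = 5000  # hourly records for one catchment (synthetic placeholder)
df = pd.DataFrame({
    "q": rng.gamma(2.0, 1.0, n),       # streamflow (m3/s)
    "precip": rng.gamma(0.5, 2.0, n),  # precipitation (mm/h)
})

lead = 6  # predict streamflow 6 hours ahead
for lag in range(1, 13):  # lagged real-time observations as features
    df[f"q_lag{lag}"] = df["q"].shift(lag)
    df[f"p_lag{lag}"] = df["precip"].shift(lag)
df["target"] = df["q"].shift(-lead)
df = df.dropna()

features = [c for c in df.columns if "lag" in c]
split = int(0.8 * len(df))  # chronological split, no shuffling
model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05)
model.fit(df[features][:split], df["target"][:split])
pred = model.predict(df[features][split:])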

How to cite: Martinsen, G., Sweeney, Y., Pedersen, J. W., Alexandru, R., Capape, S., Harris, C., Butts, M., and Diaz, M.: Danish national early warning system for flash floods based on a gradient boosting machine learning framework, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16626, https://doi.org/10.5194/egusphere-egu23-16626, 2023.

The purpose of this abstract is to describe a coupled CFD-MPM model that combines soil mechanics (saturated sediments) with fluid mechanics (seawater or air) as well as solid mechanics (structures) to consider interactions between soil, fluid, and structures. In this formulation, the Material Point Method, which models large deformations in porous media and structures, is coupled with the Implicit Continuous-fluid Eulerian method, which models complex fluid flows. The model has been validated against various benchmarks and is then used to simulate earthquake-induced submarine landslides. It is shown that the model captures the complicated interactions between saturated sediment, seawater, and offshore structures, which allows us to estimate the impact of potential submarine landslides on offshore structures.

How to cite: Tran, Q. A.: A hybrid MPM-CFD model for simulating earthquake-induced submarine landslides, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-112, https://doi.org/10.5194/egusphere-egu23-112, 2023.

EGU23-1131 | ECS | Orals | NH3.11

Landsifier: A python library to estimate likely triggers and types of landslides 

Ugur Ozturk, Kamal Rana, Kushanav Bhuyan, and Nishant Malik

The accuracy of landslide hazard models depends on landslide databases for model training and testing. Landslide databases frequently lack information on the underlying triggering mechanism (i.e., earthquake, rainfall), rendering them nearly useless in hazard models.

We created Landsifier, a unique Python-based library with three different machine-learning frameworks for assessing the likely triggering mechanisms of individual landslides or entire inventories, based on landslide 2D planforms and 3D shapes derived from an underlying digital elevation model (DEM). The base method extracts landslide planform properties as a feature space for a shallow learner, the random forest (RF). An alternative approach uses 2D landslide images as input to a convolutional neural network (CNN) deep learning algorithm. The final framework uses topological data analysis (TDA) to extract features from 3D landslide surfaces, which are then fed into the random forest classifier as a feature space. We tested the developed methods on six inventories spread over Japan and achieved mean accuracies ranging from 70% to 98%.

Building on this trigger classifier, we are working on the next generation, which will similarly classify landslide types (i.e., flows, slides, falls, complex).
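
A minimal sketch of the base method, under assumed feature definitions, is shown below: simple planform geometry descriptors computed with shapely feed a random forest trigger classifier. The descriptor set, polygons, and labels are illustrative placeholders, not Landsifier's actual feature space.

import numpy as np
from shapely.geometry import Polygon
from sklearn.ensemble import RandomForestClassifier

def planform_features(poly):
    """Simple 2D shape descriptors of a landslide polygon (illustrative set)."""
    hull = poly.convex_hull
    return [
        poly.area,
        poly.length,                               # perimeter
        poly.length**2 / (4 * np.pi * poly.area),  # compactness (1 = circle)
        poly.area / hull.area,                     # convexity
    ]

# Placeholder polygons and trigger labels (0 = rainfall, 1 = earthquake).
polys = [Polygon([(0, 0), (4, 0), (4, 1), (0, 1)]),
         Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])]
labels = [0, 1]

X = np.array([planform_features(p) for p in polys])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
print(clf.predict(X))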

How to cite: Ozturk, U., Rana, K., Bhuyan, K., and Malik, N.: Landsifier: A python library to estimate likely triggers and types of landslides, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1131, https://doi.org/10.5194/egusphere-egu23-1131, 2023.

EGU23-1600 | ECS | Orals | NH3.11

Nonsmooth simulations of 3D Drucker-Prager granular flows and validation against experimental column collapses 

Gauthier Rousseau, Thibaut Métivet, Hugo Rousseau, Gilles Daviet, and Florence Bertails-Descoubes

Testing advanced numerical hydro-mechanical models against well-controlled experiments is a critical step in improving our understanding of unsteady granular mass flows, and is necessary to establish domains of validity for any further risk assessment.
To this end, experimental granular collapses were performed to evaluate the sand6 numerical simulator introduced by Daviet & Bertails-Descoubes (2016), which represents the granular medium as an inelastic and dilatable continuum subject to the Drucker-Prager yield criterion in the dense regime, and computes its dynamics using a 3D material point method (MPM). A specificity of this numerical model is that it solves the Drucker-Prager nonsmooth rheology without any regularisation, by leveraging tools from nonsmooth optimisation.
This nonsmooth simulator, which relies on a constant friction coefficient, is able to reproduce with high fidelity various experimental granular collapses over inclined erodible beds, provided the friction coefficient is set to the avalanche angle rather than to the stop angle, as is generally done. The results, obtained for two different granular materials and for bed inclinations ranging from 0° to 20°, suggest that a simple constant-friction rheology remains a reasonable choice for capturing a large variety of granular collapses up to aspect ratios of the order of 10.
Investigating the precise role of the frictional walls by performing experimental and simulated collapses with various channel widths, we find that, contrary to some assumptions previously made in the literature, the channel width has less influence than expected on the granular flow and deposit.
The constant-coefficient model is extended with a hysteresis model, thereby improving the predictions of the early-stage dynamics of the collapse. This illustrates the potential effects of such phenomenology on transient granular flows, paving the way to more elaborate analyses.
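
For reference, one common cohesionless form of the Drucker-Prager yield criterion used for dense granular flow can be written as follows (our notation, added for illustration; the simulator's exact formulation is given in Daviet & Bertails-Descoubes, 2016):

\[
f(\boldsymbol{\sigma}) = \left\lVert \operatorname{dev}\boldsymbol{\sigma} \right\rVert - \mu\, p \le 0,
\qquad p = -\tfrac{1}{3}\operatorname{tr}\boldsymbol{\sigma},
\]

where \(\mu\) is the constant friction coefficient discussed above; the material deforms plastically only when \(f = 0\) and behaves rigidly when \(f < 0\).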

How to cite: Rousseau, G., Métivet, T., Rousseau, H., Daviet, G., and Bertails-Descoubes, F.: Nonsmooth simulations of 3D Drucker-Prager granular flows and validation against experimental column collapses, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1600, https://doi.org/10.5194/egusphere-egu23-1600, 2023.

The record of occurrence times in investigated landslide inventories is often incomplete, which introduces errors into the statistical relationship between rainfall and landslides, lowers the accuracy of the critical rainfall threshold models built from it, and in turn increases the false-positive rate of meteorological early warnings. This study takes rainfall-induced landslides in the Wanzhou District of Chongqing from 1995 to 2015 as its research object, with Henghe Township, where historical disaster data are seriously incomplete, as the verification area. We propose a model for predicting the daily temporal probability of landslide occurrence based on Long Short-Term Memory (LSTM) and Temporal Convolutional Network (TCN) architectures. The method reconstructs the temporal information of rainfall-induced landslide events by simulating the nonlinear relationship between landslide occurrence times and rainfall. The landslide events with reconstructed temporal information were verified and selected, and then applied to the division of the E-D effective rainfall threshold curve in order to establish a landslide meteorological warning model. The average temporal probability of rainfall-induced landslide occurrence predicted by the proposed method reached 90.33%, higher than that of an ANN (71.17%), LSTM alone (72.75%), and TCN alone (86.91%). Using a 90% threshold on the predicted daily occurrence probability, the occurrence-time records in the Henghe Township verification area, originally 18 records covering 42 landslides, were expanded to 201. Compared with using only the historical landslide events, the meteorological warning model based on the expanded temporal information has a more reasonable warning classification, and the effective warning rate at the severe warning level increased by 42.86%. The proposed method is of constructive significance for the daily temporal probability prediction of rainfall-induced landslides at the regional scale and can help governments make accurate risk decisions in landslide meteorological warning.
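
As a minimal sketch of the temporal-convolution component only (the study combines it with an LSTM), a causal dilated 1-D convolution stack of the kind used in TCNs can be written as follows; the layer sizes and the synthetic rainfall input are illustrative assumptions.

import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """Dilated 1-D convolution that never looks into the future."""
    def __init__(self, c_in, c_out, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(c_in, c_out, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):
        out = self.conv(x)
        return out[:, :, :-self.pad]  # trim the look-ahead padding

tcn = nn.Sequential(
    CausalConv1d(1, 16, dilation=1), nn.ReLU(),
    CausalConv1d(16, 16, dilation=2), nn.ReLU(),
    CausalConv1d(16, 16, dilation=4), nn.ReLU(),
    nn.Conv1d(16, 1, 1), nn.Sigmoid())  # daily landslide probability

rain = torch.randn(8, 1, 90)  # batch of 90-day rainfall windows (synthetic)
print(tcn(rain).shape)        # torch.Size([8, 1, 90])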

How to cite: Zhao, Y. and Chen, L.: Rainfall-induced Landslide temporal probability prediction and meteorological early warning modeling based on LSTM_TCN model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1702, https://doi.org/10.5194/egusphere-egu23-1702, 2023.

The Jurassic red strata of the Three Gorges Reservoir Area in China consist of interbedded thick siltstone and thin sandy mudstone and contain many clay minerals, such as montmorillonite and illite, which are water-sensitive, weak, expansive, and easily decomposed by water weathering. In particular, owing to seasonal rainfall, the development of settlements, and large-scale reservoir impoundment, many slow-moving landslides (e.g., deep rotational and planar landslides) occur. Nevertheless, the reconnaissance, updating, and mapping of the kinematic features of township-area landslides have not received appropriate attention from governments and researchers. Landslide susceptibility mapping is a necessary prerequisite for landslide hazard and risk assessment, but a certain degree of unpredictability is always associated with the modeling. The main objective of this work is to introduce deep ensemble learning into landslide susceptibility assessment to improve the performance of maximum likelihood models. Model construction therefore focused on three basic classifiers (decision tree, support vector machine, and multi-layer perceptron neural network) and two homogeneous ensemble models (random forest and extreme gradient boosting). Two prominent ensemble techniques, homogeneous/heterogeneous model ensembles and the bagging, boosting, and stacking strategies, were applied to implement the deep ensemble learning. Thirteen influencing factors were prepared as predictor variables. The landslide susceptibility maps were validated by the area under the receiver operating characteristic curve (AUC). The validation results showed that the ensemble models achieve AUC values higher than 0.9, an improvement over the basic classifiers. Deep ensemble learning focuses more on detecting the landslide susceptibility areas with the highest probability of occurrence. The stacking-based RF-XGBoost model obtained the best verification score (AUC = 0.955). The comparison between the susceptibility map and landslide inventory data is encouraging, as most of the recorded landslide pixels (about 83.3%) fall in high susceptibility levels. In addition, the information gain ratio showed that the Yangtze River and human engineering activities mainly affect the results, which is consistent with the current situation in the study area. The township-level landslide susceptibility mapping approach can also be extended to other urban and rural areas affected by landslides to reduce landslide disaster risk and to inform further development strategies.
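
A minimal sketch of the stacking strategy with RF and XGBoost base learners, assuming a generic factor matrix, could look as follows; the logistic-regression meta-learner is a common default, not necessarily the study's choice.

import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 13))  # 13 influencing factors (synthetic)
y = rng.integers(0, 2, 500)         # landslide / non-landslide labels

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss")),
    ],
    final_estimator=LogisticRegression(),
    cv=5)  # out-of-fold base predictions feed the meta-learner

stack.fit(X, y)
print(stack.predict_proba(X[:3])[:, 1])  # susceptibility scores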

How to cite: Zeng, T., Yin, K., and Wu, L.: Uncertainty research of landslide susceptibility mapping based deep ensemble learning: different basic classifier and ensemble strategy, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2445, https://doi.org/10.5194/egusphere-egu23-2445, 2023.

EGU23-4428 | Orals | NH3.11

From Dense Flows to Powder Cloud Simulations: The OpenFOAM Avalanche Module 

Matthias Rauter, Julia Kowalski, and Wolfgang Fellin

OpenFOAM [1] is a well-known and widely used framework for physical simulations. Its Finite Area Framework allows the depth-integrated simulation of flows on nearly arbitrary surfaces. It was shown that this framework can be applied to snow avalanche simulations in natural terrain [2].

We will present the latest updates to the framework and the implementation of the avalanche module. The module not only provides a model for dense flow avalanches [2], but was recently extended to simulate powder snow avalanches and mixed snow avalanches. Various well-known friction and snow entrainment models are available, as well as unique models for deposition and for the coupling of the dense flow and powder cloud layers in mixed snow avalanches. For practical applications, the module provides interfaces and methods for integration with geographic information systems (GIS) and is fully capable of using raster and shape files for input and output.

The avalanche module is built to integrate well into the OpenFOAM structure and follows the common user concepts of OpenFOAM. Therefore, users familiar with OpenFOAM should be able to adapt quickly to the module and run simulations after a short time. The module is provided as open source, and its structure enables and encourages implementing and experimenting with new ideas. One major goal of the module is to reduce the time from model development to model evaluation and application.

The module is hosted and developed collaboratively at develop.openfoam.com/Community/avalanche. We will provide an introduction to the framework and the development process, and give interested people pointers on how to get started with the module and how to implement their own ideas.

[1] Weller, H. G., Tabor, G., Jasak, H., & Fureby, C. (1998). A tensorial approach to computational continuum mechanics using object-oriented techniques. Computers in physics, 12(6), 620-631.

[2] Rauter, M., Kofler, A., Huber, A., & Fellin, W. (2018). faSavageHutterFOAM 1.0: depth-integrated simulation of dense snow avalanches on natural terrain with OpenFOAM. Geoscientific Model Development, 11(7), 2923-2939.

How to cite: Rauter, M., Kowalski, J., and Fellin, W.: From Dense Flows to Powder Cloud Simulations: The OpenFOAM Avalanche Module, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4428, https://doi.org/10.5194/egusphere-egu23-4428, 2023.

EGU23-4715 | Posters on site | NH3.11

MultiResUNet, VGG16, and U-Net applications for landslide detection 

Saro Lee, Fatemeh Rezaie, and Mahdi Panahi

The frequent occurrence of disastrous landslides can lead to significant infrastructure damage, loss of life, and the relocation of populations. Early detection of landslides is crucial for mitigating their consequences. Today, deep learning algorithms, particularly fully convolutional networks (FCNs) and their variants such as ResU-Net, are utilized to rapidly and automatically detect landslides. In the current study, three deep learning models, MultiResUNet, VGG16, and U-Net, were used to detect landslides on Hokkaido Island, Japan. Our dataset comprises Sentinel-2 images and a mask layer with "landslide" or "non-landslide" labels. The suggested framework is based on the analysis of satellite images of landslide-prone locations using Sentinel-2 bands 2 (blue), 3 (green), 4 (red), and 5 (visible and near-infrared), together with slope and elevation factors. We trained each model on the dataset and evaluated their performance using a variety of statistical indices, including precision, recall, and F1 score. The results showed that the MultiResUNet model outperformed the other two models, achieving an accuracy of 82.7%; the VGG16 and U-Net models achieved accuracies of 65.5% and 67.2%, respectively. The results indicate the capability of deep learning algorithms to process satellite images for early landslide detection and offer the opportunity to implement efficient and effective disaster management strategies.

How to cite: Lee, S., Rezaie, F., and Panahi, M.: MultiResUNet, VGG16, and U-Net applications for landslide detection, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4715, https://doi.org/10.5194/egusphere-egu23-4715, 2023.

Gravity-driven geophysical granular flows, such as rock avalanches, landslides, and debris flows, interact with obstacles (e.g., bridge piers and buildings) as they flow down the slope, causing rapid changes in flow velocity and height in the vicinity of the obstacle and forming a granular shock wave in front of it. The interaction between shock waves affects the granular-flow field near the obstacles. However, the complex physical processes involved make it challenging to understand how the granular material behaves in the region influenced by shock-shock interaction.

In this study, systematic chute experiments were performed with glass particles to investigate the dynamic interaction between granular flow and two circular cylinders with variable spacing distances. Pressure sensors were used to measure the impact pressure of the granular flow on the upstream cylindrical surfaces and on a plate mounted flush with the chute bed. Accelerometers were mounted at the bottom of the plate to record the seismic signals generated by the granular flow impacting the bed and the cylinders. Flow velocities and depths were determined using an image processing method. The discrete element method (DEM) was utilized to construct a virtual model of the chute system and particles and to simulate the dynamic processes of the granular flow interacting with the cylinders. The experimental and DEM-simulated results showed that bow shock waves were generated just upstream of the two cylinders and that a granular vacuum zone formed on the lee side of each cylinder, with the incoming flow velocity significantly reduced in the granular-shock influencing area. As the spacing decreases, the two shock waves change from being independent to mutually interfering. In addition, the effects of the spacing distance on the shapes of the granular vacuum and the bow shock waves were investigated experimentally and compared to the DEM results, showing a strong interaction between granular shocks. The pinch-off distance, which is determined by the length of the granular vacuum, also depends on the spacing distance of the cylinders, decreasing as the spacing decreases. The impact pressures and acoustic signals generated by the granular flow impacting the chute bed and the cylinder surfaces in the shock influencing area were also analyzed for varying Froude numbers.

In summary, the DEM simulations and the recorded signals are helpful for analyzing the interaction between granular shock waves. The findings of the present study may contribute to a better understanding of granular shock dynamics and may eventually improve the design of protective structures in hazard-prone areas.

How to cite: Wang, J., Chen, Z., and Wang, D.: Effects of Spacing Distance between Cylindrical Obstacles on Granular Shock Interactions in Gravity-Driven Experimental Flows, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5177, https://doi.org/10.5194/egusphere-egu23-5177, 2023.

EGU23-5309 | ECS | Orals | NH3.11

Impacts of flow path water-saturation for debris-flow erosion modelling at Illgraben (Switzerland) 

Anna Lena Könz, Jacob Hirschberg, Brian McArdell, Perry Bartelt, and Peter Molnar

Debris flows can significantly grow along their flow path by entraining sediments stored in the channel bed and banks. This entrainment process is influenced by various factors such as flow properties (e.g., flow momentum, basal shear stress) and environmental conditions (e.g., soil water saturation, sediment availability). In recent years, different attempts to include the entrainment process in runout models have improved modelled flow properties and runout behavior by empirically linking entrainment volumes to individual modelled flow properties. Linking entrainment to environmental factors, however, has remained challenging.

Here, we aim to implement and test the influence of water-saturated flow path conditions in debris-flow runout modelling in a Swiss debris-flow basin (Illgraben). To this end, the modified RAMMS runout model, which includes an empirical algorithm describing entrainment as a function of basal shear stress (Frank et al., 2015), is coupled with a simple hydrological model that predicts soil water saturation. In a first step, the RAMMS model was calibrated for the Illgraben site for seven events with detailed data on erosion/deposition along the fan as well as flow properties at the outflow of the simulation domain (de Haas et al., 2022). In the calibration procedure, the focus was placed on the erosion proportionality factor dz/dtau [m/kPa], which links the maximum potential erosion depth to the basal shear stress, as it is assumed to drive the saturation-induced increase in entrained volume. Preliminary results show that in most cases including the entrainment process improves the reproduction of the flow properties, especially the ‘hydrograph’ front, and that the erosion proportionality factor dz/dtau varies significantly between events. In a second step, the relationship between soil moisture conditions and the maximum erosion depth expected along the flow path was investigated. The hydrologic conditions are simulated with a conceptual model solving the water balance for the basin’s headwaters. The headwater discharge serves as the water input for the channel on the fan, where an infiltration model is applied and entrainment is investigated. The presented framework, which could be incorporated into other runout models, is expected to be useful for debris-flow entrainment modelling, as well as for assessing climate change impacts on debris-flow runout.
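
For illustration, the entrainment relation calibrated here can be sketched as below, following the form of Frank et al. (2015), in which the maximum potential erosion depth grows linearly with basal shear stress above a critical value; the parameter values are placeholders, not calibrated results.

import numpy as np

def max_erosion_depth(tau_kpa, dz_dtau=0.1, tau_c_kpa=1.0):
    """Maximum potential erosion depth (m) as a linear function of basal
    shear stress above a critical threshold (after Frank et al., 2015).
    dz_dtau [m/kPa] is the erosion proportionality factor calibrated here;
    both parameter values are placeholder assumptions."""
    return np.where(tau_kpa > tau_c_kpa,
                    dz_dtau * (tau_kpa - tau_c_kpa), 0.0)

tau = np.array([0.5, 2.0, 5.0, 10.0])  # basal shear stress along path (kPa)
print(max_erosion_depth(tau))          # [0.   0.1  0.4  0.9]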

References

de Haas, T., McArdell, B.W., Nijland, W., Åberg, A.S., Hirschberg, J., Huguenin, P., 2022. Flow and Bed Conditions Jointly Control Debris‐Flow Erosion and Bulking. Geophysical Research Letters 49. https://doi.org/10.1029/2021GL097611

Frank, F., McArdell, B.W., Huggel, C., Vieli, A., 2015. The importance of entrainment and bulking on debris flow runout modeling: examples from the Swiss Alps. Nat. Hazards Earth Syst. Sci. 15, 2569–2583. https://doi.org/10.5194/nhess-15-2569-2015

How to cite: Könz, A. L., Hirschberg, J., McArdell, B., Bartelt, P., and Molnar, P.: Impacts of flow path water-saturation for debris-flow erosion modelling at Illgraben (Switzerland), EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5309, https://doi.org/10.5194/egusphere-egu23-5309, 2023.

EGU23-6411 | ECS | Posters virtual | NH3.11 | Highlight

Importance of water and water producing processes in cascading events in mountainous regions 

Jessica Munch and Perry Bartelt

Over recent years, several multiphase avalanches have been observed, some of them leading to a cascade of events, such as the 2021 event at Chamoli, India, where a mixture of ice and rock fell from Ronti Peak and transitioned into a debris flow involving large amounts of water. Another example is the 2017 event at Pizzo Cengalo, Switzerland, where the rock face collapsed onto the underlying glacier, entraining part of it, and also transitioned into a debris flow. When such a mass movement occurs and leads to a cascade of events, the runout distances are much longer, and the consequences, for both humans and infrastructure, are much more severe.

When a multiphase avalanche turns into a cascade of events, the amount of water present in the flow appears to be a determining factor for the runout distance. The sources of water for both of the aforementioned events remain debated, and the amounts of water that can be generated by the melting of ice in the flow or by entrainment are poorly constrained. Indeed, from the moment that ice and snow are involved in a multi-material gravitational flow, they have the potential to melt due to friction between the different components of the flow and with the ground, and hence to generate water. Material entrainment along the way also has the potential to either directly incorporate water into the flow or bring in material with a high water content (i.e., hydrated sediments) or ice, which can melt while the flow propagates. Accurate modelling of the thermal state of the flow, as well as of its ability to entrain material along the way, is necessary to quantify the amount of water present in the flow.

Here, using a multiphase depth-averaged model specifically designed to handle gravitational flows made of rocks, ice, water, and snow, or any single one of these components, we assess 1) the impact of heat transfer between the materials and 2) the impact of entrainment of multiphase ground material on the flow behaviour, and more specifically on the water content of the flow and its consequences in terms of runout distances and potential for cascading events.

First results show that both entrainment and heat transfer within the flow play a major role in water production. Our experiments suggest that heat transfer between rocks and ice leads to the most efficient water production. Material entrainment also plays a major role by incorporating water into the flow or producing it through the melting of entrained ice. Better constraints on material thermal properties, ground composition, and entrainment potential are, however, necessary to accurately quantify the amounts of water that can join the flow and influence the runout distances.
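
As a back-of-the-envelope sketch of the heat-to-water pathway discussed above, frictional work dissipated in the flow can be converted to meltwater via the latent heat of fusion; the heat partitioning factor, ice fraction, and dissipation rate below are placeholder assumptions, not results from the model.

L_F = 3.34e5  # latent heat of fusion of ice (J/kg)

def melt_rate(friction_power_w, ice_fraction=0.3, to_ice_fraction=0.5):
    """Meltwater production rate (kg/s) from frictional dissipation.
    to_ice_fraction: assumed share of dissipated heat reaching the ice;
    ice_fraction: assumed mass fraction of ice in the flow (both illustrative)."""
    return friction_power_w * to_ice_fraction * min(ice_fraction, 1.0) / L_F

# Example: 50 MW of basal and internal frictional dissipation (hypothetical).
print(melt_rate(50e6))  # roughly 22 kg of meltwater per second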

How to cite: Munch, J. and Bartelt, P.: Importance of water and water producing processes in cascading events in mountainous regions, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6411, https://doi.org/10.5194/egusphere-egu23-6411, 2023.

EGU23-6718 | ECS | Posters on site | NH3.11 | Highlight

Generating multi-temporal landslide inventories through a general deep transfer learning strategy using HR EO data 

Kushanav Bhuyan, Hakan Tanyas, Lorenzo Nava, Silvia Puliero, Sansar Raj Meena, Mario Floris, Cees van Westen, and Filippo Catani

Mapping landslides in space has gained considerable attention over the past decade, with good results. Current methods are primarily used to generate event inventories, but multi-temporal (MT) inventories are rare, even with manual landslide mapping. Here, we present an innovative deep learning strategy employing transfer learning, which allows our Attention Deep Supervision multi-scale U-Net model to be adapted to landslide detection tasks in new regions. The method also provides the flexibility to retrain a pretrained model to detect both rainfall- and earthquake-induced landslides in new regions of interest. For mapping, archived Planet Labs remote sensing imagery from 2009 to 2021 at spatial resolutions of 3-5 m was used to systematically generate MT landslide inventories. Examining all cases, our approach provided an average F1 value of 0.8, indicating that it successfully identified the spatiotemporal occurrence of landslides. To examine the size distribution of the mapped landslides, we compared the frequency-size distribution of predicted co-seismic landslides with manually mapped products from the literature. The results showed good agreement between the calculated power-law exponents, with differences ranging from 0.04 to 0.21. Overall, this study demonstrates that the proposed algorithm can be applied to large areas to construct polygon-based MT landslide inventories.
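
The exponent comparison mentioned above can be sketched with the standard maximum-likelihood estimator for a continuous power-law tail (Clauset et al., 2009); the landslide areas below are synthetic, and the cutoff value is an assumption for illustration.

import numpy as np

def powerlaw_exponent(areas, a_min):
    """MLE of the power-law exponent alpha for areas >= a_min
    (continuous case, Clauset et al., 2009)."""
    x = np.asarray(areas, dtype=float)
    x = x[x >= a_min]
    return 1.0 + x.size / np.sum(np.log(x / a_min))

rng = np.random.default_rng(0)
# Synthetic landslide areas drawn from a power law with alpha = 2.4.
u = rng.uniform(size=5000)
areas = 1e3 * (1.0 - u) ** (-1.0 / (2.4 - 1.0))
print(powerlaw_exponent(areas, a_min=1e3))  # close to 2.4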

How to cite: Bhuyan, K., Tanyas, H., Nava, L., Puliero, S., Meena, S. R., Floris, M., Westen, C. V., and Catani, F.: Generating multi-temporal landslide inventories through a general deep transfer learning strategy using HR EO data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6718, https://doi.org/10.5194/egusphere-egu23-6718, 2023.

EGU23-6884 | ECS | Orals | NH3.11

Using Deep Learning for Sentinel-1-based Landslide Mapping 

Aiym Orynbaikyzy, Frauke Albrecht, Wei Yao, Simon Plank, Andres Camero, and Sandro Martinis

Every year, landslides kill or injure thousands of people worldwide and substantially impact human livelihoods. With the increasing number of extreme weather events due to the changing climate, urban sprawl, and the intensification of human activities, the number of deadly landslide events is expected to grow. Landslides often occur unexpectedly because of the difficulty of predicting their location and timing. In such cases, providing information on the spatial extent of the landslide hazard is essential for organising and executing first-response actions on the ground.

This study explores the advantages and limitations of using high-resolution Synthetic Aperture Radar (SAR) data from Sentinel-1 within a deep learning framework for rapidly mapping landslide events. The objectives of the research are four-fold: 1) to investigate how Sentinel-1 landslide mapping can be improved using deep learning; 2) to explore whether the addition of up to three pre-event scenes improves SAR-based classification accuracies; 3) to test whether, and by how much, the addition of polarimetric decomposition features and interferometric coherence improves classification accuracies; and 4) to test whether data augmentation affects the final results.

We adopt a semantic segmentation model, U-Net, and a novel deep network, U2-Net, to map landslides based on limited but globally distributed landslide inventory data. In total, 306 image patches of 128x128 pixels were split into 80% for training/validation of the model and 20% for testing. We calculate radar backscatter information (gamma nought VV and VH), polarimetric decomposition features (alpha angle, entropy, anisotropy), and the interferometric coherence between temporally adjacent scenes. The features are calculated for three pre-event scenes and one post-event scene. Copernicus Digital Elevation Model (DEM) data are used to integrate land surface elevation and slope information into the classification process.

Using all Sentinel-1 features, the best deep learning model obtained a Dice coefficient of 0.96 on the validation data. Landslide detection based on U2-Net gave slightly better results than the U-Net-based approach. The accuracies of models based on one, two or three pre-event scenes did not differ substantially, indicating no added value from additional pre-event SAR features. Higher accuracies were reached when polarimetric decomposition features were combined with interferometric coherence, compared to runs with only radar backscatter. Increasing the sample size using image augmentation methods, such as four-directional rotation and flipping, helped improve the accuracy.
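
For reference, the Dice coefficient reported above can be computed on binary segmentation masks as in the short sketch below (the masks here are synthetic).

import numpy as np

def dice(pred, truth, eps=1e-7):
    """Dice coefficient between two binary masks (1 = landslide pixel)."""
    pred = np.asarray(pred, bool)
    truth = np.asarray(truth, bool)
    inter = np.logical_and(pred, truth).sum()
    return (2.0 * inter + eps) / (pred.sum() + truth.sum() + eps)

a = np.zeros((128, 128), int); a[30:60, 30:60] = 1
b = np.zeros((128, 128), int); b[35:65, 30:60] = 1
print(round(float(dice(a, b)), 3))  # 0.833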

Future research is directed towards (i) increasing and diversifying the landslide examples, (ii) performing landslide-event-based resampling, and (iii) adding pre- and post-event optical data from Sentinel-2.

How to cite: Orynbaikyzy, A., Albrecht, F., Yao, W., Plank, S., Camero, A., and Martinis, S.: Using Deep Learning for Sentinel-1-based Landslide Mapping, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6884, https://doi.org/10.5194/egusphere-egu23-6884, 2023.

EGU23-8446 | ECS | Orals | NH3.11 | Highlight

Automatic detection of landslides from satellite images using a range of training events 

Kathryn Leeming, Itahisa Gonzalez Alvarez, Alessandro Novellino, and Sophie Taylor

Landslides in remote or uninhabited regions can go undocumented, leaving gaps in landslide inventories, which are a key input for hazard and risk assessments. This can lead to landslide events being missing from research studies and contributes to a bias in the events used for training machine learning models.

In this work we use satellite images, terrain information, and labelled examples of landslides to train a convolutional neural network (U-Net), for the purpose of adding previously undocumented and new landslides to inventories. This model segments the input images and highlights the pixels it labels as landslides.

Our work focusses on landslides with a range of types and triggers, so that the model is exposed to a variety of training data. We describe the key properties of the landslides in the training set, and discuss the implications for future uses of the trained model.

How to cite: Leeming, K., Gonzalez Alvarez, I., Novellino, A., and Taylor, S.: Automatic detection of landslides from satellite images using a range of training events, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8446, https://doi.org/10.5194/egusphere-egu23-8446, 2023.

EGU23-8596 | ECS | Posters virtual | NH3.11 | Highlight

Evaluating effects of topographies on explicit hydromechanical solvers using procedural generation 

Saoirse Robin Goodwin

A key problem for landslide research is evaluating hydromechanical solvers on a suitable variety of terrain types. There currently exists a large gulf between studies using hydromechanical solvers on highly idealised terrain and those on real topographies. This makes it difficult to properly evaluate (i) the sensitivity of the solver output to specific terrain features and (ii) potential numerical artifacts. One way to bridge the gap is to use procedural generation, which has been used extensively in the videogame and animation industries for three decades, to generate hillsides with controlled properties. Indeed, the size and frequency of topographical features can be set by procedural generation algorithms, so the spatial distribution of topographical features can be varied in isolation. This study uses a depth-averaged SPH solver to model single-surge flows on a variety of procedurally generated terrains. We investigate the effects of the spatial distribution and magnitude of terrain features on the deposition patterns of the flows. We also discuss other potential applications of these approaches, including hazard mapping for cases where topographical uncertainty is likely (e.g., for modelling snow avalanches).
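
A minimal sketch of one such procedural technique, spectral synthesis (an assumed example; the study's exact algorithm is not specified here), generates random surfaces whose feature size is controlled by a single spectral exponent:

import numpy as np

def spectral_terrain(n=256, beta=3.0, seed=0):
    """Random surface with power spectrum ~ k^(-beta): larger beta gives
    smoother terrain with broader features, smaller beta rougher terrain."""
    rng = np.random.default_rng(seed)
    kx = np.fft.fftfreq(n)[:, None]
    ky = np.fft.fftfreq(n)[None, :]
    k = np.sqrt(kx**2 + ky**2)
    k[0, 0] = 1.0  # avoid division by zero at the mean component
    amplitude = k ** (-beta / 2.0)
    amplitude[0, 0] = 0.0  # zero-mean surface
    phase = np.exp(2j * np.pi * rng.uniform(size=(n, n)))
    z = np.fft.ifft2(amplitude * phase).real
    return z / z.std()  # normalised heights, to be scaled onto a hillside

rough = spectral_terrain(beta=2.0)   # small, frequent features
smooth = spectral_terrain(beta=4.0)  # large, gentle features
print(rough.shape, smooth.shape)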

How to cite: Goodwin, S. R.: Evaluating effects of topographies on explicit hydromechanical solvers using procedural generation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8596, https://doi.org/10.5194/egusphere-egu23-8596, 2023.

EGU23-8895 | ECS | Orals | NH3.11

Application of SOSlope to shallow landslide triggering in Rüdlingen (Switzerland) 

Ilenia Murgia, Filippo Giadrossich, Denis Cohen, Gian Franco Capra, and Massimiliano Schwarz

The development and application of deterministic models for vegetated slope stability analysis at the local scale is a pivotal issue in international research. Such tools help identify mitigation and risk management techniques during increasingly frequent critical rainfall events. In this sense, the SOSlope software, developed by the ecorisQ international association (www.ecorisq.org), allows the simulation of the hydro-mechanical dynamics that may influence the occurrence of shallow landslides, focusing on the progressive activation of root reinforcement in space and time to counteract soil movement.

This study presents a reconstruction of an artificially triggered landslide in Rüdlingen (Switzerland), carried out during the Triggering Rapid Mass Movements project, aiming at a back-analysis of the hydro-mechanical conditions leading to its triggering. This experiment makes it possible to compare real-scale data on the triggering dynamics of shallow landslides with modeling assumptions and results. Detailed measurements taken during the investigation and the subsequent slope failure were used to calibrate the hydro-mechanical input parameters of SOSlope and to evaluate the model's capability to reproduce the landslide-triggering conditions and behaviour.

Results show a reasonable reconstruction of the complex dynamics leading to the loss of soil stability, in particular with respect to the effect of water and the force redistribution dynamics during triggering. SOSlope can quantify the effect of the spatial distribution of root reinforcement and of passive earth pressure. In addition to quantifying the maximum root reinforcement mobilized to counteract soil movement, SOSlope makes it possible to observe its progressive activation in space and time. Pore water pressure dynamics show a distinctive trend related to preferential flow in soil fractures and macropores; the decrease of suction stress due to increased water content in the soil matrix was also observed. SOSlope allows a systemic analysis of the landslide event by evaluating the different phases of change in slope stability and identifying the causes that favoured failure. These results are valuable for understanding shallow landslide triggering dynamics on vegetated slopes, which simpler models can only address through simplified assumptions. This tool could support risk management strategies, including green-based solutions near structures and infrastructure, or reforestation activities for slope stabilization. In the latter case, the structure, composition, and efficiency of the plantation can be checked with the software.

Future developments in SOSlope will include the implementation of a triangulated grid mesh to overcome computational limitations associated with the square-grid resolution of the raster input data, and the inclusion of new tree species for root reinforcement estimation.

How to cite: Murgia, I., Giadrossich, F., Cohen, D., Capra, G. F., and Schwarz, M.: Application of SOSlope to shallow landslide triggering in Rüdlingen (Switzerland), EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8895, https://doi.org/10.5194/egusphere-egu23-8895, 2023.

EGU23-9956 | ECS | Orals | NH3.11

A surrogate model for depth-averaged erosion and deposition closures using deep learning 

Mohammad Nikooei and Clarence Edward Choi

Geophysical mass flows are commonly modelled using depth-averaged (DA) numerical models, which rely on closure relations to account for erosion and deposition. While erosion and deposition are grain-scale phenomena, their physics is overlooked due to the simplifications required in DA models. In this study, a framework is proposed to transfer the grain-scale physics of erosion and deposition to the continuum scale of DA models. A long short-term memory (LSTM) neural network is coupled with a DA model to incorporate the grain-scale physics of erosion and deposition. As a surrogate for the closure relation, the LSTM model is trained on results from grain-scale Discrete Element Method (DEM) simulations. The surrogate model is evaluated by studying the deposition of an initially flowing granular mass over a slope. The effective flow depth h and DA velocity u calculated by the DA-LSTM model are compared with DEM simulation results. The DA-LSTM model is demonstrated to be more computationally efficient than DEM simulations. The newly proposed surrogate model offers a promising approach to computing more complex closures using deep learning techniques.
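
A minimal sketch of such a surrogate, under assumed input and output definitions (the actual state variables and targets are those of the study's DA model and DEM data), maps a short history of depth-averaged flow state to an erosion/deposition rate:

import torch
import torch.nn as nn

class ClosureLSTM(nn.Module):
    """Surrogate closure: a sequence of DA flow states (e.g., h, u, slope)
    -> basal erosion(+)/deposition(-) rate, to be queried by the DA solver."""
    def __init__(self, n_features=4, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # rate at the latest time step

model = ClosureLSTM()
# One batch of 10-step state histories for 64 cells (synthetic placeholder;
# in practice the training targets come from grain-scale DEM simulations).
states = torch.randn(64, 10, 4)
print(model(states).shape)  # torch.Size([64, 1])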

How to cite: Nikooei, M. and Edward Choi, C.: A surrogate model for depth-averaged erosion and deposition closures using deep learning, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9956, https://doi.org/10.5194/egusphere-egu23-9956, 2023.

EGU23-10159 | ECS | Posters on site | NH3.11

Unravelling the complex dynamic of slow-moving landslides in the Flysch zone region, Lower Austria. A case study of the Hofermühle catchment. 

Yenny Alejandra Jiménez Donato, Edoardo Carraro, Philipp Marr, Robert Kanta, and Thomas Glade

Slow-moving landslides are complex processes that represent a significant challenge for landslide dynamics analysis and disaster risk reduction. In some cases, they have been considered early signals of potentially destructive events, as they can accelerate under specific climatic conditions and cause significant damage. However, slow-moving landslides have often been neglected, as they require significant time, human resources, and specific numerical models to assess their non-uniformity. Considering the existing gaps and the lack of data on slow-moving landslides in Austria, a long-term monitoring project has been carried out by the ENGAGE group of the University of Vienna. Instrumentation for hydro-geotechnical monitoring has been installed on several landslides in Lower Austria, using them as living laboratories for multi-temporal landslide investigation. The present study therefore aims to integrate these valuable hydro-mechanical data to shed light on potential acceleration conditions of slow-moving landslides, frequency-intensity relationships, and cascading hazards initiated from within the slow-moving landslide mass.

The geographical and geological conditions of the province of Lower Austria make it a region highly susceptible to the occurrence of landslides. The predominant geology corresponds to the units of the Flysch Zone and the Klippen Zone, mechanically weak units composed of intercalated limestones and deeply weathered materials. These conditions, along with the hydrological setting, land use changes, and other anthropogenic impacts, contribute to the instability of the region. Consequently, to understand landslide processes and mechanisms, we integrate the hydro-mechanical data compiled from the monitoring sites to model a complex event triggered in 2013 in the Hofermühle catchment, district of Waidhofen an der Ybbs, with the aim of improving our understanding of landslide conditioning factors and the triggering mechanisms of potential cascading hazards in the region.

How to cite: Jiménez Donato, Y. A., Carraro, E., Marr, P., Kanta, R., and Glade, T.: Unravelling the complex dynamic of slow-moving landslides in the Flysch zone region, Lower Austria. A case study of the Hofermühle catchment., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10159, https://doi.org/10.5194/egusphere-egu23-10159, 2023.

EGU23-10269 | Posters on site | NH3.11

Detecting Landslide Affected Areas Using Deep Learning of Bi-Temporal Satellite Imagery Datasets 

Fuan Tsai, Elisabeth Dippold, Po-Jui Huang, and Chi-Chuan Lo

Landslides are among the most frequent and destructive natural hazards in Taiwan and many other places around the world. Using satellite images to help identify landslide-affected regions can be an effective and economical alternative to conventional ground-based measures. However, utilizing remotely sensed images for the investigation and analysis of landslides still faces challenges. In long-term monitoring of landslide-affected areas, it is common to observe landslides occurring repeatedly at or around the same location; identifying this type of recurrent landslide, and especially monitoring its expansion, therefore requires change-detection analysis of multi-temporal image datasets. In recent years, machine learning techniques have been extensively adopted for image analysis, including satellite images. Therefore, integrating change detection with machine learning algorithms should be helpful for identifying and mapping incremental landslides from multi-temporal satellite images. This research developed a systematic deep learning framework for detecting landslides with bi-temporal satellite image pairs as the training datasets. The training datasets were extracted and labelled from multi-temporal high-resolution multi-spectral satellite images covering two watershed regions where landslides occur frequently. Experimental results indicate that the developed machine learning algorithms achieve high accuracies and perform better than conventional methods for detecting landslide-affected areas from time-series satellite images, especially in places where landslides may occur repeatedly.

How to cite: Tsai, F., Dippold, E., Huang, P.-J., and Lo, C.-C.: Detecting Landslide Affected Areas Using Deep Learning of Bi-Temporal Satellite Imagery Datasets, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10269, https://doi.org/10.5194/egusphere-egu23-10269, 2023.

The Western Ghats (WG) of India experience frequent landslides during every Indian summer monsoon. The unique blend of topography and tropical humid climate accelerates chemical weathering, forming a layer of unconsolidated soil that unconformably overlies the Precambrian crystalline rock. The lack of cohesion or bonding between these contrasting geologic materials makes the WG vulnerable to various forms of landslides during the peak of the Indian summer monsoon. Hence, detailed information on soil thickness plays a predominant role in identifying landslide-prone areas and understanding landslides in the WG. However, soil thickness maps are not available for the WG area, and the steep, rugged terrain makes it difficult to collect detailed soil thickness data. This study used a random forest (RF) machine-learning model to predict soil depth from a limited number of sparse samples in the Panniar river basin of the WG. The model combined 70 soil depth observations with eleven covariates: normalized difference vegetation index, topographic wetness index, valley depth, solar radiance, elevation, slope length, slope angle, slope aspect, convergence index, profile curvature, and plan curvature. The results show that the RF model has good predictive accuracy, with a coefficient of determination (R2) of 0.822 and a root mean square error (RMSE) of 2.968, i.e., about 80% of the soil depth variation is explained. The spatially predicted soil depth map clearly shows regional patterns with local details. Both geomorphological processes and vegetation contributed to shaping soil depth in the study area. The resulting map can be used for understanding soil characteristics and modelling landslide susceptibility in the study area.
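
As an illustration only (not the authors' code), the described regression can be sketched in Python with scikit-learn; the synthetic arrays below merely stand in for the 70 observations and eleven covariates.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.random((70, 11))                   # NDVI, TWI, elevation, slope, ...
y = 10 * X[:, 0] + rng.normal(0, 1, 70)    # synthetic soil depth values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)

pred = rf.predict(X_te)
print("R2:", r2_score(y_te, pred))                      # study reports 0.822
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)   # study reports 2.968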

How to cite: Asokan Laila, A. and Gopinath, G.: Soil depth Prediction in a landslide prone tropical river basin under data-sparse conditions using machine-learning technique, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11135, https://doi.org/10.5194/egusphere-egu23-11135, 2023.

EGU23-13292 | Orals | NH3.11

Advances in landslide analysis by using remote sensing and artificial intelligence (AI): Results from MultiSat4SLOWS project 

Mahdi Motagh, Simon Plank, Wandi Wang, Aiym Orynbaikyzy, Magdalena Vassileva, and Mike Sips

Landslides are a major type of natural hazard that causes significant human and economic losses in mountainous regions worldwide. Optical and synthetic aperture radar (SAR) satellite data are increasingly being used to support landslide investigation due to their multi-spectral and textural characteristics, multi-temporal revisit rates, and large area coverage. Understanding landslide occurrence, kinematics, and correlation with external triggering factors is essential for landslide hazard assessment. Landslides are usually triggered by rainfall, and the associated cloud cover often limits the use of optical images alone. Exploiting SAR data, with their cloud penetration and all-weather measurement capability, provides a more precise temporal characterization of landslide kinematics and occurrence. However, except for a few research studies, the full potential of SAR data for operational landslide analysis has not yet been exploited. This is a very demanding task, considering the vast amount of Sentinel-1 data that has been globally available since October 2014.

In this presentation we summarise the achievements made within the framework of the MultiSat4SLOWS project (Multi-Satellite imaging for Space-based Landslide Occurrence and Warning Service), financed within the Helmholtz Imaging 2020 call. The project aims at developing a multi-sensor approach for detecting and analysing the occurrence time and spatial extent of landslides using freely available SAR data from Sentinel-1. Within this project, we generated a reference database based on Sentinel-1 and -2 data for training, testing, and validation of deep learning algorithms. The reference database contains various landslide examples that occurred worldwide and includes pre- and post-event polarimetric, coherence, and backscatter features. We also investigated the applicability of SAR/InSAR time-series data for detecting landslide timing. Finally, we introduce a prototype of a Visual Analytics platform for rapid analysis of spatial and temporal ground deformation patterns and their correlation with external triggering factors.

How to cite: Motagh, M., Plank, S., Wang, W., Orynbaikyzy, A., Vassileva, M., and Sips, M.: Advances in landslide analysis by using remote sensing and artificial intelligence (AI): Results from MultiSat4SLOWS project, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13292, https://doi.org/10.5194/egusphere-egu23-13292, 2023.

EGU23-13333 | ECS | Orals | NH3.11

Geophysical mass flow over complex micro-topography: from grain-scale mechanics to continuum modeling 

Lu Jing, Shuocheng Yang, and Fiona C. Y. Kwok

Geophysical mass flows involve granular earth materials surging down natural slopes and are one of the major threats to mountainous regions worldwide. Accurate modeling of geophysical mass flows requires closure relations both within the flow (rheology) and at the flow-substrate interface (boundary conditions). However, although recent years have seen significant advances in the modeling of granular flow rheology, our understanding of how flowing granular materials interact with the substrate remains largely elusive. Here, we focus on micro-topography, i.e., geometric base roughness of about the same size as the grain size, and investigate its effects on granular flow dynamics as well as the associated closure relations. To systematically vary the base roughness from smooth to rough, we generate the base using immobile particles with varying particle size and spatial arrangement in laboratory experiments (with particle image velocimetry for extracting flow kinematics) and discrete element method simulations. Two granular flow scenarios are considered: steady-state flow down inclines and granular column collapse. In the first scenario, it is found that basal slip occurs when the base roughness is below a range of intermediate values, and a general slip law connecting the slip velocity, the mean flow velocity, and the base roughness is developed. In the second, transient flow scenario, basal slip inevitably occurs even for very rough bases due to inertial effects, and a transient basal slip law is proposed that correlates the slip velocity with local flow properties based on kinetic theory arguments. The basal slip laws developed in this work can be readily incorporated as a dynamic boundary condition in continuum modeling of granular flows. In future work, grain-scale mechanisms relevant to more realistic geophysical flows will be investigated, including the feedback effects of pore fluid pressure on flow mobility during basal sliding and the role of irregular particle shapes in hydro-mechanical modeling of geophysical mass flows.

How to cite: Jing, L., Yang, S., and Kwok, F. C. Y.: Geophysical mass flow over complex micro-topography: from grain-scale mechanics to continuum modeling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13333, https://doi.org/10.5194/egusphere-egu23-13333, 2023.

EGU23-13523 | ECS | Posters on site | NH3.11

Automatic landslide detection using Sentinel-1 and -2 images - a glacial case study 

Alexandra Jarna Ganerød, Erin Lindsay, Ola Fredin, Tor-Andre Myrvoll, Steinar Nordal, Martina Calovi, and Jan Ketil Rød

Although Norway is a country with rough terrain and a high frequency of unstable steep slopes, there is a scarcity of landslide data available. This limits the accuracy of thresholds for early warning systems and of hazard maps, both of which rely on historic event data. There is great potential to supplement existing ground-based observations with automated landslide detection using satellite imagery and deep learning. Working towards an automated system for landslide detection in Norway, we investigated which imagery types and machine-learning models performed best for detecting landslides in a formerly glaciated landscape.

We trained a deep learning model locally using Keras, TensorFlow 2, and a U-Net architecture. As input data, we used multi-temporal composites of Sentinel-1 and -2 image stacks of all available images from one month pre- and post-event. Processed bands included dNDVI (difference in maximum normalised difference vegetation index) from Sentinel-2, and pre- and post-event Synthetic Aperture Radar (SAR) data (terrain-corrected, mean of multi-temporal ascending/descending images, in VV polarisation) from Sentinel-1. Training and evaluation were performed with a well-verified landslide inventory of 120 manually mapped rainfall-triggered landslides from Jølster (30 July 2019), in Western Norway. We tested the model with four input data settings using different bands and polarisations for the pre- and post-event SAR data: 1) full version (all 13 bands); 2) dNDVI (Sentinel-2), preVV, postVV (Sentinel-1); 3) preVV, postVV (Sentinel-1); and 4) post-R, post-G, post-B, post-NIR, dNDVI (Sentinel-2). The results were compared to those of a pixel-based conventional machine learning model (Classification and Regression Tree) using the same input data. The second input data setting provided the best results. The precision for all four input data settings lies between 80-85%, with Matthews correlation coefficient values from 51-89%. Moreover, the deep-learning model significantly outperforms the conventional machine learning model in input data setting #3. The patch-based classification method far outperforms pixel classification owing to its ability to differentiate the landslide signal from the random noise produced by speckle in undisturbed areas. In addition, this represents one of the first attempts to fuse SAR and optical data for landslide detection, and we show that there is an advantage in doing so in this case.
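
For illustration only, a minimal U-Net of the kind described can be sketched in Keras/TensorFlow 2. The 13-band input follows the "full version" setting above; the tile size, network depth, and filter counts are assumptions, not the authors' exact architecture.

import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # two successive 3x3 convolutions, the basic U-Net building block
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

inputs = layers.Input(shape=(128, 128, 13))          # multi-temporal S1/S2 stack
c1 = conv_block(inputs, 32)                          # encoder, level 1
p1 = layers.MaxPooling2D()(c1)
c2 = conv_block(p1, 64)                              # encoder, level 2
p2 = layers.MaxPooling2D()(c2)
b = conv_block(p2, 128)                              # bottleneck
u2 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(b)
c3 = conv_block(layers.concatenate([u2, c2]), 64)    # decoder with skip connection
u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(c3)
c4 = conv_block(layers.concatenate([u1, c1]), 32)    # decoder with skip connection
outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)  # per-pixel landslide probability

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Precision()])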

How to cite: Ganerød, A. J., Lindsay, E., Fredin, O., Myrvoll, T.-A., Nordal, S., Calovi, M., and Rød, J. K.: Automatic landslide detection using Sentinel-1 and -2 images - a glacial case study, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13523, https://doi.org/10.5194/egusphere-egu23-13523, 2023.

Geophysical mass flows typically consist of a granular solid phase with a broad grain size distribution and an interstitial fluid phase. During the flow, larger particles tend to segregate and thereby accumulate at the flow surface and front, resulting in dramatic changes in the flow and deposition characteristics, such as enhanced runout distances and stratified deposit patterns. However, current hydro-mechanical modeling of geophysical mass flows often does not consider grain size segregation and the resulting internal heterogeneity of the flow, which can largely compromise the predictability of existing hydro-mechanical models. A major challenge lies in the multiscale nature of grain segregation and its effects on flow mobility, which requires detailed characterization of segregation mechanics at both the particle and flow levels. Here, we first review recent advances in a multiscale framework in which the driving and resistive forces of segregation on a single intruder particle or a collection of large particles have been formulated based on discrete element method simulations and theoretical analysis. Then, we discuss how these particle-scale forces can be carried over into a continuum formulation for segregation flux modeling and connected with the flow dynamics in a two-way coupled manner. These physics-based force formulations reflect the micromechanics of segregation and lead to enhanced predictive modeling of particle size dynamics in granular flows. Finally, we discuss the potential of extending the proposed framework to consider the effects of interstitial fluids and other mechanisms in upscaled hydro-mechanical modelling of more realistic geophysical mass flows.

How to cite: Liu, M. and Jing, L.: Modelling grain size segregation in geophysical mass flows: bridging particle-level forces and continuum models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14147, https://doi.org/10.5194/egusphere-egu23-14147, 2023.

EGU23-14199 | Orals | NH3.11

“Fusion network with attention for landslide detection. Application to Bijie landslide open dataset” 

Candide Lissak, Thomas Corpetti, and Mathilde Letard

Remote sensing techniques are now widespread for the early detection of ground deformation, the implementation of warning systems in case of imminent landslide triggering, and medium- and long-term slope instability monitoring. The large breadth of data available to the scientific community, together with processing techniques that improved as data volumes grew, has led to noticeable developments in remote sensing data processing using machine learning algorithms and, more particularly, deep neural networks.

This arsenal of data and techniques is necessary to meet the scientific challenges that the landslide research community still faces. As landslides can be complex, risk management and disaster mitigation strategies require a precise idea of their location, shape, and size so that they can be studied and monitored. The challenge is to automate landslide detection and mapping, especially through learning methods. Machine learning methods based on deep neural networks have recently been employed for landslide studies and provide promising, efficient results for landslide detection [1].

In this study, we propose an original neural network for landslide detection. More precisely, we exploit a fusion network [2] dealing with optical images on the one hand and Digital Elevation Models on the other. To improve the results, attention layers [3] (which stabilize training and yield more precise results) as well as mixup techniques [4] (which improve generalization) are exploited.

The model was trained and tested on the open Bijie landslide dataset.
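
As a brief illustration of the mixup technique [4] mentioned above, the Python sketch below blends pairs of training samples and labels; the alpha value is an arbitrary placeholder.

import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    # draw a mixing coefficient from a Beta distribution
    lam = np.random.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2   # blended input (e.g., image + DEM stack)
    y = lam * y1 + (1.0 - lam) * y2   # blended (soft) label
    return x, y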

Keywords: Remote sensing for landslide monitoring and detection, landslide detection, deep neural networks, attention

[1] Ji, S., Yu, D., Shen, C., Li, W., & Xu, Q. (2020). Landslide detection from an open satellite imagery and digital elevation model dataset using attention-boosted convolutional neural networks. Landslides, 17(6), 1337-1352.

[2] Song, W., Li, S., Fang, L., & Lu, T. (2018). Hyperspectral image classification with deep feature fusion network. IEEE Transactions on Geoscience and Remote Sensing, 56(6), 3173-3184.

[3] Niu, Z., Zhong, G., & Yu, H. (2021). A review on the attention mechanism of deep learning. Neurocomputing, 452, 48-62.

[4] Thulasidasan, S., Chennupati, G., Bilmes, J. A., Bhattacharya, T., & Michalak, S. (2019). On mixup training: Improved calibration and predictive uncertainty for deep neural networks. Advances in Neural Information Processing Systems, 32.

How to cite: Lissak, C., Corpetti, T., and Letard, M.: “Fusion network with attention for landslide detection. Application to Bijie landslide open dataset”, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14199, https://doi.org/10.5194/egusphere-egu23-14199, 2023.

EGU23-14546 | ECS | Orals | NH3.11

ML-based characterization of PS-InSAR multi-mission point clouds for ground deformation classification 

Claudia Masciulli, Michele Gaeta, Giorgia Berardo, Gianmarco Pantozzi, Carlo Alberto Stefanini, and Paolo Mazzanti

Persistent Scatterer Interferometry (PSI) is a powerful multitemporal A-DInSAR (Advanced Differential Synthetic Aperture Radar Interferometry) technique widely used for monitoring and measuring Earth's surface displacements over large areas with sub-centimetric precision. The capability to detect ground deformation processes relies on the available PSI spatial density, which is strictly related to the resolution of the considered sensor and the presence of stable natural and artificial reflectors. A new data fusion approach, developed as part of the “MUSAR” project funded by ASI (Italian Space Agency), integrates multi-band SAR sensors to improve the coverage of PSI data by synthesizing multi-sensor displacement information. The integration of multi-mission PSI generates synthetic measurement points, named Ground Deformation Markers (GD-Markers), featuring vertical (Up-Down) and horizontal (East-West) components of the displacements. The fusion of PSI data extracted from C-band Sentinel-1 images from the Copernicus initiative and from the COSMO-SkyMed constellation in the X-band from ASI contributed to creating a dataset with high information content.

Each cluster of GD-Markers with displacement measurements identifies a specific deformation process in the region of interest. After selecting the relevant clusters of points, the deformation processes were classified into different categories (e.g., landslide, subsidence) to improve their understanding and evaluation for mitigating natural hazards. This study aimed to develop a machine learning-based classification system, starting from GD-Marker point clouds, that supports the automation of ground displacement identification and characterization. The synthetic points were characterized either as individual entities or as point clouds formed by discrete clusters of points in space, to evaluate the advantage of treating each point independently versus incorporating local neighborhood information. The structured point data were analyzed using a supervised Random Forest (RF) approach to evaluate the performance of point cloud classification and categorization and to identify the best initial setting. In point cloud classification, each point was assigned a label representing a deformation process, while in categorization one label is provided for the entire point cloud.
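
The two settings can be illustrated with a minimal scikit-learn sketch (not the authors' implementation); the feature set, the aggregation of clusters by mean and standard deviation, and the majority-vote labels are assumptions for demonstration.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_pts = rng.random((5000, 6))          # per-point GD-Marker features (placeholder)
y_pts = rng.integers(0, 2, 5000)       # per-point process label (placeholder)
cluster = rng.integers(0, 100, 5000)   # cluster id of each point

# (a) point cloud classification: each point is an independent sample
rf_points = RandomForestClassifier(n_estimators=300).fit(X_pts, y_pts)

# (b) categorization: one aggregated feature vector and one label per cluster
ids = np.unique(cluster)
X_cl = np.array([np.hstack([X_pts[cluster == c].mean(0),
                            X_pts[cluster == c].std(0)]) for c in ids])
y_cl = np.array([np.bincount(y_pts[cluster == c]).argmax() for c in ids])
rf_clusters = RandomForestClassifier(n_estimators=300).fit(X_cl, y_cl)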

Comparing models’ performances allowed the definition of the best possible approach for classifying the deformation processes observed by GD-Markers point clouds. The analysis assessed the effectiveness of the classification of single points or clusters to identify the optimal setup that achieves an accurate segmentation between adjacent deformation processes. Identifying this initial setting was essential for selecting and developing advanced deep-learning approaches.

How to cite: Masciulli, C., Gaeta, M., Berardo, G., Pantozzi, G., Stefanini, C. A., and Mazzanti, P.: ML-based characterization of PS-InSAR multi-mission point clouds for ground deformation classification, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14546, https://doi.org/10.5194/egusphere-egu23-14546, 2023.

EGU23-14639 | ECS | Posters on site | NH3.11

Performance analysis of a U-Net landslide detection model 

Itahisa Gonzalez Alvarez, Kathryn Leeming, Alessandro Novellino, and Sophie Taylor

Image segmentation algorithms are a type of image classifier that assigns a label to each individual pixel in an image. U-Nets, initially developed for the analysis of biomedical images and now widely used in a variety of fields, are an example of such algorithms. It has been shown that U-Nets are especially interesting when working with small training datasets and combined with data augmentation techniques.

In this study, we used satellite images with labelled landslide masks from known events to train a U-Net to identify areas of potential landslides. These landslide masks are time-consuming to create, resulting in a small initial training set. Even when working with U-Nets, the success of machine learning and AI tools depends on the availability and quality of training data, as well as on the algorithm settings during the training process. Tuning machine learning models to achieve the best possible performance from limited amounts of data is important to generate trustworthy results that can be used to advance knowledge of landslide events around the world.

Here, we show the differences in algorithm performance as we use different types of data augmentation and model parameters. We also explore and assess the effects on performance of options such as including different satellite bands, terrain information and alternative colour band representations.
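
One common way to augment image/mask pairs consistently is to transform them jointly; the TensorFlow sketch below illustrates this idea with random flips and rotations, which may differ from the augmentations actually tested in this study.

import tensorflow as tf

def augment(image, mask):
    # image: (H, W, C) float tensor; mask: (H, W, 1) binary tensor.
    # Stacking on the channel axis guarantees identical geometric transforms.
    stacked = tf.concat([image, tf.cast(mask, image.dtype)], axis=-1)
    stacked = tf.image.random_flip_left_right(stacked)
    stacked = tf.image.random_flip_up_down(stacked)
    stacked = tf.image.rot90(stacked, k=tf.random.uniform([], 0, 4, tf.int32))
    return stacked[..., :-1], stacked[..., -1:]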

How to cite: Gonzalez Alvarez, I., Leeming, K., Novellino, A., and Taylor, S.: Performance analysis of a U-Net landslide detection model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14639, https://doi.org/10.5194/egusphere-egu23-14639, 2023.

Steep slopes, deforestation, unconsolidated deposits, high annual rainfall, and a highly dissected landscape facilitate the occurrence of landslides along one of the most important Colombian highways, the “Via al Llano”, frequently causing traffic interruptions. Prior to a susceptibility assessment of the area, a multitemporal inventory is required. Usually, landslides are identified and mapped by visual interpretation of optical satellite and/or aerial images. However, in study areas located in tropical regions such as that of the Via al Llano, the frequent presence of clouds means that many images are needed to identify landslides and estimate the period of their occurrence. Therefore, an automatic detection procedure is indispensable for large tropical areas and multitemporal event inventories. The cloud-based Google Earth Engine (GEE) allows geospatial processing of freely available multi-temporal data. In this work, we perform automatic detection of landslides using the Normalized Difference Vegetation Index (NDVI) from Sentinel-2 (optical images) and the SAR-backscatter change from Sentinel-1 (radar images) over a sector of the Buenavista area, extending for 53 km2 in the southern portion of the Via al Llano. Considering a period during which the occurrence of some landslides blocked the highway, images before and after this event were selected for automatic detection, and the results were compared with a landslide inventory previously prepared by an expert operator through visual analysis of images available on Google Earth (optical natural-colour images). To assess the ability of each method to discriminate between landslides and stable slopes, confusion matrices were calculated. The NDVI-based approach demonstrated an acceptable ability to identify the landslides, although it generated a high number of false positives. The SAR-based method exhibited a lower ability to correctly detect the landslide polygons, while generating fewer false positives. This may be due to the pattern of predicted positives, which mostly consists of isolated pixels; conversely, the NDVI-based approach provides groups of adjacent pixels predicted as positives, which better reproduce the shapes of the landslide polygons. Finally, by combining the two approaches and using topographic masks, better accuracy in the automatic mapping of our multitemporal landslide inventories was achieved.
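
The NDVI-based part of such a workflow can be sketched with the Earth Engine Python API as follows; the region, date windows, cloud filter, and change threshold are placeholders, not the values used in this study.

import ee
ee.Initialize()

region = ee.Geometry.Rectangle([-73.75, 4.15, -73.65, 4.25])  # placeholder extent

def ndvi_composite(start, end):
    # maximum-NDVI composite from Sentinel-2 surface reflectance
    s2 = (ee.ImageCollection("COPERNICUS/S2_SR")
            .filterBounds(region)
            .filterDate(start, end)
            .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 40)))
    return s2.map(lambda img: img.normalizedDifference(["B8", "B4"])).max()

pre = ndvi_composite("2019-04-01", "2019-06-01")    # before the blocking event
post = ndvi_composite("2019-06-15", "2019-08-15")   # after the blocking event

# A strong NDVI drop flags candidate landslide pixels
candidates = post.subtract(pre).lt(-0.2)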

How to cite: Calderon-Cucunuba, L. P. and Conoscenti, C.: Automatic mapping of multitemporal landslide inventories by using open-access Synthetic Aperture Radar and NDVI imagery in Google Earth Engine: a case study of the “Via al Llano” highway (Colombia), EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15954, https://doi.org/10.5194/egusphere-egu23-15954, 2023.

EGU23-16166 | Posters on site | NH3.11 | Highlight

Numerical modelling of mudflows impacting settlements: a case study 

Alessandro Leonardi, Giulia La Porta, and Marina Pirulli

Mudflows are common natural hazards, often originating from the liquefaction of shallow landslides triggered by rainfall. The numerical back-analysis of past events is key to projecting the application of numerical models towards forward analysis. However, the complex multi-physics nature of the problem hampers the development of comprehensive frameworks. Notwithstanding, calibrated numerical models able to simulate all aspects of the problem (triggering and runout) can still be valuable tools for aiding the design of countermeasures. Currently, this can only happen if calibration is performed on the specific site, or on sites with very similar geomorphological and geological characteristics.

In this presentation, the application of a coupled triggering and runout model is explored. Two case studies of well-known events that occurred in Southern Italy are presented. A pseudo-plastic model is used for the post-triggering rheology. The resolution of the runout simulation goes down to the level of the specific exposed element (houses, roads), allowing an ad-hoc assessment of risk on key pieces of infrastructure. The results reveal interesting aspects of how the complex topographic features of settlements challenge the traditional workflow for back-analysis. In particular, the channelization of flows within the settlement itself leads to an overestimation of hazard, unless care is taken to resolve the triggering phase down to the sub-basin scale.

How to cite: Leonardi, A., La Porta, G., and Pirulli, M.: Numerical modelling of mudflows impacting settlements: a case study, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16166, https://doi.org/10.5194/egusphere-egu23-16166, 2023.

EGU23-16501 | Posters on site | NH3.11

Assessment of landslide susceptibility in the rocky coast subsystem of Essaouira, Morocco 

Sergio C. Oliveira, Abdellah Khouz, Jorge Trindade, Fatima ElBchari, Blaid Bougadir, Ricardo A. C. Garcia, and Mourad Jadoud

Several researchers have developed landslide susceptibility maps in recent years using a variety of methods and models. The Information Value method has frequently been used to assess landslide susceptibility in a variety of coastal environments. In this study, we used this bivariate statistical technique to assess the susceptibility to landslides of the coastal region of Essaouira. A total of 588 landslides were identified, classified, and mapped along this rocky coastal stretch. The observation and interpretation of many data sources, such as high-resolution satellite images, aerial photographs, topographic maps, and extensive field surveys, were employed to understand terrain predisposing conditions and to predict landslides. Essaouira's rocky coastal system is situated in the centre of Morocco's Atlantic coast. The study region was divided into 1534 cliff terrain units (50 m wide). The landslide inventory was randomly split into two separate groups for training and validation purposes: 70% of the landslides were used for training the susceptibility model and 30% for independent validation. Twenty-two layers of landslide conditioning factors were prepared: elevation, slope angle, slope aspect, plan curvature, profile curvature, cliff height, topographic wetness index, topographic position index, slope-over-area ratio, solar radiation, presence of faulting, lithological units, toe lithology, presence and type of cliff toe protection, layer tilt, rainfall, streams, land-use patterns, normalized difference vegetation index, and granulometry of the lithological material. Using a pixel-based model (12.5 m x 12.5 m) and an elementary terrain-unit-based model, the bivariate Information Value approach was used to determine the statistical link between the conditioning factors and the various landslide types and to produce the coastal landslide susceptibility maps. The multiple coastal landslide susceptibility models were evaluated for accuracy and predictive power using the receiver operating characteristic curve and the area under the curve. The findings allowed 38% of the rocky coast subsystem to be designated as highly susceptible to landslides, with the majority of these areas found in the southern part of the coastal region of Essaouira. Both future planned development operations and environmental conservation can benefit from these susceptibility maps.
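
For reference, the Information Value of a class i of a conditioning factor is commonly computed as

IV_{i} = \ln \frac{N_{land,i} / N_{i}}{N_{land} / N},

where N_{i} is the number of terrain units in class i, N_{land,i} the number of those containing landslides, and N and N_{land} the corresponding totals; the susceptibility score of a terrain unit is then the sum of the IV_{i} of the classes it belongs to. The exact variant implemented in this study may differ in detail.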

Acknowledgements: The work has been financed by national funds through FCT (Foundation for Science and Technology, I. P.), in the framework of the project “HighWaters – Assessing sea level rise exposure and social vulnerability scenarios for sustainable land use planning” (EXPL/GES-AMB/1246/2021).

How to cite: Oliveira, S. C., Khouz, A., Trindade, J., ElBchari, F., Bougadir, B., Garcia, R. A. C., and Jadoud, M.: Assessment of landslide susceptibility in the rocky coast subsystem of Essaouira, Morocco, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16501, https://doi.org/10.5194/egusphere-egu23-16501, 2023.

EGU23-17563 | Orals | NH3.11 | Highlight

Impact of a debris flow surge on a vertical wall oblique with respect to flow direction 

Aronne Armanini, Alessia Fontanari, and Fabio Sartori

Debris flows are rapid to very rapid flows made up of a highly concentrated mixture of water and sediments. These flows are catastrophic natural phenomena affecting mountain areas and causing extensive property damage and loss of life every year. The mitigation of these phenomena is therefore fundamental: check dams and longitudinal protection walls are among the main structural passive countermeasures. A crucial aspect in the definition of the design criteria for these structures is the analysis of the impact force exerted by a debris flow on them.
From a scientific point of view, the state of the art in this field is quite lacking, despite the relevance of the topic. In the case of the impact of a debris surge on a vertical plane normal to the flow direction, according to Armanini and Scotton (1992), two main types of impact may occur. The first type consists of a complete deviation of the flow along the vertical obstacle, assuming a jet-like behavior (Figure 1). The second type is characterized by the formation of a reflected wave after the impact, which propagates upstream (Figure 2). The analytical solution based on momentum and mass balances is already known in both cases (see Armanini 2009 and Armanini et al. 2020), and the comparison between theoretical results and experimental data is quite satisfactory. 
Much less studied is the case of the impact of a debris flow surge on a vertical wall arranged obliquely with respect to the flow direction, as in the case of lateral protection walls. 
To better understand its kinematic characteristics, the phenomenon has been studied in the Hydraulic Laboratory of the University of Trento. It has been reproduced in a channel of variable slope by releasing a given volume of fluid and measuring its impact force on a gate situated at the end of the channel at different oblique orientations with respect to the flow direction. Several channel slopes and solid-fraction concentrations have been investigated. 
When the flow crashes into the gate, it is deflected vertically along the obstacle and initially forms a vertical jet, which is soon deviated in the direction parallel to the gate.
The phenomenon has been theoretically investigated both in the light of the one-dimensional theory of fluid impacts, already adopted for the case of impact on a vertical wall arranged orthogonally to the flow, and using a simplified approach derived from the classical two-dimensional theory of Ippen (1951) for the deviation of supercritical currents. The comparison between the predictions of the theory and the experimental data turns out to be quite good.
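
For context, a classical one-dimensional momentum balance of the kind invoked above gives the peak thrust per unit width on a wall normal to the flow as

F = \beta \rho h u^{2} + \tfrac{1}{2} k \rho g h^{2},

where \rho is the mixture density, h and u the depth and velocity of the incoming surge, \beta a momentum-correction coefficient, and k an earth-pressure-like coefficient. This generic form is given only as an illustration of the balance; it is not necessarily the specific formulation of Armanini (2009) or Armanini et al. (2020).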

How to cite: Armanini, A., Fontanari, A., and Sartori, F.: Impact of a debris flow surge on a vertical wall oblique with respect to flow direction, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17563, https://doi.org/10.5194/egusphere-egu23-17563, 2023.

EGU23-295 | ECS | Posters on site | GM3.3

GIS-FSLAM-FORM: A QGIS plugin for fast probabilistic susceptibility assessment of rainfall-induced landslides at regional scale 

Hongzhi Cui, Marcel Hürlimann, Vicente Medina, and Jian Ji

Landslide susceptibility analysis is a necessary procedure for the timely discovery and localization of potential sources of slope instability in natural terrain. The infinite slope model is broadly applied for evaluating shallow landslide susceptibility, coupling geotechnical and geological parameters with a hydrological model. Because rainfall is one of the major factors inducing landslides, the calculation of the water table and pore water pressure is an important task in our approach. To appropriately assess the most susceptible areas, we propose a new framework for regional slope stability based on probabilistic analysis, combining a hydro-mechanical model, the Fast Shallow Landslide Assessment Model (FSLAM), with a reliability method. A user-friendly plugin for the open-source geographic information system QGIS, called GIS-FSLAM-FORM, was designed and developed in the Python programming language. The plugin accounts for the potential uncertainties of geotechnical parameters (in particular the effective cohesion and friction of soil, and root strength), horizontal hydraulic conductivity, and soil depth. Our new approach is notable for its simple hydrologic model and high computational efficiency. To propagate the probabilistic information through the FSLAM infinite-slope formulation, the first-order reliability method (FORM) is applied during the analysis, although it inevitably involves iterative computation. The developed plugin, using physically-based modelling, can directly provide several regional hazard index distribution maps, such as the factor of safety (FoS), reliability index (RI), and failure probability (Pf).
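
For reference, a generic form of the infinite-slope factor of safety with root reinforcement, of the kind such models evaluate, is

FS = \frac{c' + c_{r} + (\gamma z - \gamma_{w} h_{w}) \cos^{2}\beta \tan\varphi'}{\gamma z \sin\beta \cos\beta},

where c' is the effective soil cohesion, c_{r} the root cohesion, \gamma and \gamma_{w} the unit weights of soil and water, z the soil depth, h_{w} the water table height above the failure surface, \beta the slope angle, and \varphi' the effective friction angle. FSLAM's exact parameterization may differ.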

How to cite: Cui, H., Hürlimann, M., Medina, V., and Ji, J.: GIS-FSLAM-FORM: A QGIS plugin for fast probabilistic susceptibility assessment of rainfall-induced landslides at regional scale, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-295, https://doi.org/10.5194/egusphere-egu23-295, 2023.

Random Forest (RF) is a classification algorithm that has been used successfully in geomorphological and hazard mapping (Sîrbu et al., 2019). It performs a defined number of decision-tree classifications on random samples drawn with replacement from the original training data. Because of this, the algorithm is especially robust to errors and outliers in the training data, and it is also very good at producing uncertainty estimates for the variability of results on each of the classified features. Its outputs can also be used, with different methods, to produce a ranking of the independent variables used in the classification.

The present study was performed on a given dataset, in central Italy, containing 7,360 slope units covering an area of 4,095 km2. The slope units are classified twice, based on different methodologies, into units with or without landslides. Each slope unit also has 26 attributes assigned, which were used as independent variables (Alvioli et al., 2022). The slope units are treated as spatially independent from each other and have been randomly split 70%-30% into training and validation data, respectively.

The model was set up as computer code in the R software environment. It uses different libraries to integrate the input data, run the algorithm, validate and measure the performance of the model, and finally produce the output data. Most of the model settings were used with their default values; the number of classification trees (ntree) was the only important setting that was fine-tuned, to a value of 1501, based on different model runs.

The results of the two classifications (one for each version of the dependent variable) are relatively similar, proving once again the robustness of the RF algorithm with respect to minor to medium changes in the input data. The first classification had an AUC (area under the curve) value of 0.829, compared with an AUC of 0.817 for the second classification. For each classification, a ranking of the independent variables was produced, with the standard deviation of slope being the most important predictor. Other predictors with relatively high importance were elevation and curvatures.

The results show that RF is an important classifier, which can be used with relatively little customization and on almost any dataset to produce a reliable susceptibility map. Its integration with the R software makes it easy to run the whole process virtually automatically. The computer code for the model will be made freely available.
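
The study is implemented in R; purely as an illustration of the described setup (1,501 trees, a 70%-30% split, AUC validation, and a variable ranking), a minimal Python analogue with scikit-learn is sketched below, with random arrays standing in for the slope-unit data.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.random((7360, 26))     # 26 attributes per slope unit (placeholder)
y = rng.integers(0, 2, 7360)   # landslide presence/absence (placeholder)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
rf = RandomForestClassifier(n_estimators=1501, random_state=1).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))

# Ranking of independent variables (the study found the standard
# deviation of slope to be the most important predictor).
ranking = np.argsort(rf.feature_importances_)[::-1]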

How to cite: Sirbu, F.: Landslide Susceptibility Model based on Random Forest classification, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-733, https://doi.org/10.5194/egusphere-egu23-733, 2023.

EGU23-2283 | ECS | Orals | GM3.3

Application of the LAND-SUITE software with a benchmark dataset for landslide susceptibility zonation 

Txomin Bornaetxea, Mina Yazdani, and Mauro Rossi

We propose the usage of the LAND-SUITE software to carry out 16 landslide susceptibility models exploiting the benchmark dataset provided by the session organizers. The software allows the application of Linear Discriminant Analysis (LDA), Logistic Regression (LR), and Quadratic Discriminant Analysis (QDA) as statistical methods, together with the Combination Forecast Model (CFM), which combines the outputs of the former three methods. Each of the mentioned models has been applied considering the two provided landslide presence variables (presence1 and presence2), resulting in 8 susceptibility maps that take into account the complete set of explanatory variables. We have then taken advantage of the variable-analysis outputs provided by LAND-SUITE, and the process has been repeated with a reduced set of 10 explanatory variables. The variable selection followed the principle of independence between the explanatory variables while trying to optimize the contribution of each of them to model performance, for which leave-one-out tests and the significance p-values of the LR outputs were consulted. Results show a slight but generalized improvement of the model performances when the presence2 dataset is used instead of presence1. The model performance is also maintained, or only very slightly decreased, when the number of explanatory variables is reduced from 26 to 10. However, the Area Under the ROC Curve (AUC) ranges between 0.75 and 0.82 in all tests. In addition, 9 out of the 10 selected variables are the same for both the presence1 and presence2 tests. The uncertainty associated with each of the models has also been computed by means of the bootstrap resampling method.
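
LAND-SUITE itself provides these methods; purely as an illustration, the three statistical models and a simple output combination can be sketched with scikit-learn as below. The data arrays are placeholders, and the plain average is an assumption, not LAND-SUITE's actual CFM weighting.

import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((1000, 10))     # reduced set of 10 explanatory variables
y = rng.integers(0, 2, 1000)   # landslide presence flag

models = {"LDA": LinearDiscriminantAnalysis(),
          "QDA": QuadraticDiscriminantAnalysis(),
          "LR": LogisticRegression(max_iter=1000)}
probs = {name: m.fit(X, y).predict_proba(X)[:, 1] for name, m in models.items()}

# simple combination forecast: average of the three model outputs
cfm = np.mean(list(probs.values()), axis=0)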

How to cite: Bornaetxea, T., Yazdani, M., and Rossi, M.: Application of the LAND-SUITE software with a benchmark dataset for landslide susceptibility zonation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2283, https://doi.org/10.5194/egusphere-egu23-2283, 2023.

EGU23-3566 | ECS | Posters on site | GM3.3 | Highlight

Landslide Susceptibility within the binomial Generalized Additive Model 

Marco Loche, Massimiliano Alvioli, Ivan Marchesini, and Luigi Lombardo

We develop a slope-unit based landslide susceptibility model using the benchmark dataset proposed in the session, located in Central Italy. As a result, we produce two susceptibility maps based on the two different landslide presence attribute fields included in the dataset.

The proposed dataset is a subset of a much larger one, recently used to obtain landslide susceptibility over the whole of Italy. We further explore the differences between the results obtained from the proposed dataset and the landslide susceptibility obtained at the national scale. The national-scale results were obtained with a Bayesian version of a binomial Generalized Additive Model (GAM) in R-INLA, an R implementation of the integrated nested Laplace approximation for approximate Bayesian inference. The method explains the spatial distribution of landslides using a Bernoulli likelihood from the exponential family.

This allows us to estimate fixed effects and random effects, and to assess their associated uncertainty. The residual susceptibility maps and the most common correlation measures permit us to quantify the strength and direction of the relationships between models and to capture differences in susceptibility values across the study area. On this basis, we offer a convenient approach to evaluating the similarities between the two represented landslide distributions.

We propose this model comparison for any pair of susceptibility maps, to evaluate the interpretability of the covariates and the performances where a large dataset may influence the susceptibility pattern over space.

How to cite: Loche, M., Alvioli, M., Marchesini, I., and Lombardo, L.: Landslide Susceptibility within the binomial Generalized Additive Model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3566, https://doi.org/10.5194/egusphere-egu23-3566, 2023.

EGU23-4851 | Posters virtual | GM3.3

Resolution of data, type of inventory and data splitting in machine learning-based landslide susceptibility mapping 

Neelima Satyam, Minu Treesa Abraham, and Kunal Gupta

The use of machine learning (ML) approaches for developing landslide susceptibility maps (LSMs) has gained wide popularity in the recent past. The choice of ML algorithm, the spatial resolution, the train-to-test data ratio, and the landslide conditioning factors are some of the crucial factors that decide the performance of the developed LSM. However, there are no formal guidelines on the selection of any of these factors, as the choice depends strongly on the study area. In most cases, site-specific comparative analyses are required to find the best-suited combination. Two case studies were conducted for parts of the Western Ghats in India to develop pixel-based LSMs for the Idukki and Wayanad districts. Five different ML algorithms, two different spatial resolutions, multiple train-to-test ratios, and two different types of landslide inventory data were used for developing the best-suited LSM. After detailed analysis, it was observed that the random forest (RF) algorithm resulted in the best-performing LSM for both regions. The effects of spatial resolution and data splitting were found to differ between algorithms and, among all the factors considered, data splitting was the least influential. 

How to cite: Satyam, N., Abraham, M. T., and Gupta, K.: Resolution of data, type of inventory and data splitting in machine learning-based landslide susceptibility mapping, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4851, https://doi.org/10.5194/egusphere-egu23-4851, 2023.

EGU23-5755 | ECS | Orals | GM3.3

A slope units based landslide susceptibility analyses using Weight of Evidence and Random Forest 

Marko Sinčić, Sanja Bernat Gazibara, Martin Krkač, Hrvoje Lukačić, and Snježana Mihalić Arbanas

As identified by previous work, landslides present a significant hazard in the Umbria Region, Central Italy. We present a Weight of Evidence (WoE) and Random Forest (RF) approach for deriving landslide susceptibility maps (LSMs) for slope units (SU) as the cartographic unit. The input data in this study include a layer containing 7360 SU with 26 landslide conditioning factors (LCFs) and two landslide presence flags, "presence1" (P1) and "presence2" (P2), which label 3594 and 2271 SU as unstable, respectively. The LCFs were reclassified into 10 classes using Natural Breaks, followed by collinearity testing, which resulted in selecting 11 LCFs for further analysis. Unstable SU were randomly split into two equal sets, one for deriving the LSMs and the other for validation. Whereas WoE used only unstable SU, the landslide dataset applied in RF additionally included an equal number of stable SU. The stable SU were randomly selected from the area remaining after excluding the previously selected unstable SU, simulating a temporal inventory for landslide validation. This ensured that the model was applied to unseen data and that the training dataset was unbiased. Model evaluation and LSM validation included determining the Area Under the Curve (AUC) of the curve defined by the cumulative percentage of study area in susceptibility classes against the cumulative percentage of landslide area in susceptibility classes: 50% of the unstable SU were examined for model evaluation, and the remaining 50% for validation. For the model classification parameters, all SU were used to compute the Overall Accuracy (OA) and the AUC of the Hit Rate versus False Alarm Rate curve. The RF model performed excellently, with AUC values of 86.16 (P1) and 90.00 (P2). The WoE model performed significantly worse, with AUC values of 62.09 (P1) and 69.41 (P2). LSM validation on unseen data favors WoE, with AUC values of 60.46 (P1) and 66.17 (P2), compared to 45.06 (P1) and 56.68 (P2) for RF, the latter indicating near-random prediction. Considering OA and AUC as classification parameters, RF reached OA values of 74.36 (P1) and 77.60 (P2) and AUC values of 81.65 (P1) and 84.61 (P2), whereas WoE reached significantly lower OA values of 66.03 (P1) and 69.14 (P2) and AUC values of 74.09 (P1) and 77.07 (P2). Since the P2 scenario showed better results in all four studied parameters for both methods, we point to it as the better option for defining landslide datasets with respect to the proportion of unstable and stable SU. Because a relatively large portion of the input SU is unstable, we argue that classification parameters should be prioritized when choosing the optimal method and scenario, as they consider both unstable and stable SU across the entire study area. Based on the conducted research, we suggest RF, owing to its better classification performance, as the approach for landslide susceptibility analyses and future zonation in the study area.
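
For reference, the WoE positive and negative weights for a conditioning-factor class B with respect to landslide presence L are commonly written as

W^{+} = \ln \frac{P(B \mid L)}{P(B \mid \bar{L})}, \qquad W^{-} = \ln \frac{P(\bar{B} \mid L)}{P(\bar{B} \mid \bar{L})},

with the contrast C = W^{+} - W^{-} summarizing the predictive strength of the class; the susceptibility of an SU follows from summing the weights of the classes it falls into. The exact variant applied in this study may differ in detail.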

How to cite: Sinčić, M., Bernat Gazibara, S., Krkač, M., Lukačić, H., and Mihalić Arbanas, S.: A slope units based landslide susceptibility analyses using Weight of Evidence and Random Forest, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5755, https://doi.org/10.5194/egusphere-egu23-5755, 2023.

EGU23-6053 | ECS | Posters on site | GM3.3

Landslide Susceptibility Mapping via binomial Generalized Additive Model 

Gianvito Scaringi and Marco Loche

Developments of geostatistical models for landslide susceptibility mapping often disregard interpretability, although this element is of fundamental importance for risk assessment. Recent trends in machine learning show that gains in performance tend to come at the expense of the interpretability of the mechanical processes represented in geostatistical models, in which geomorphic causation is lost.

We took the benchmark dataset in central Italy as our study case, for which a complete inventory of landslides is available. We built two landslide susceptibility models using a Generalized Additive Model (GAM) with a slope-unit partitioning of the area (~4,100 km2, comprising 7,360 slope units) and a set of 26 independent variables, with the aim of classifying the presence/absence of landslides.

We tested the capability of a binomial GAM with nonparametric smoothing functions to evaluate the interpretability of the covariates, and we obtained satisfactory results in terms of performance with a reasonable compromise on interpretability.
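
In its standard form, such a binomial GAM models the landslide presence Y_{i} of slope unit i as a Bernoulli variable with

\mathrm{logit}\, P(Y_{i} = 1) = \beta_{0} + \sum_{j} f_{j}(x_{ij}),

where the f_{j} are nonparametric smooth functions of the covariates x_{ij}; it is these fitted smoothers that provide the covariate interpretability discussed above.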

GAMs are very popular classifiers in landslide susceptibility studies and, even though other methods yield better performance, we suggest that interpretability in geostatistical analyses should advance in tandem with improvements in model performance.

How to cite: Scaringi, G. and Loche, M.: Landslide Susceptibility Mapping via binomial Generalized Additive Model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6053, https://doi.org/10.5194/egusphere-egu23-6053, 2023.

Rain-induced natural terrain landslides are the most frequent geo-hazard in many regions of the world. As an essential tool for addressing rising landslide challenges due to climate change, landslide susceptibility assessment has been widely investigated in Hong Kong for over twenty years. However, a public dataset for Hong Kong landslide susceptibility assessment is currently absent from the geoscience research community, which makes it difficult to establish consistent evaluation criteria for testing any new method or theory. Thus, to facilitate the development of new statistical and/or artificial intelligence-based methods for landslide susceptibility assessment, here we compile the first version of The Hong Kong University of Science and Technology – Landslide Susceptibility Dataset (HKUST-LSD) based on multiple sources of open data. Aiming to comprehensively describe the conditioning factors of rain-induced natural terrain landslides in Hong Kong, HKUST-LSD v1.0 comprises (a) a landslide inventory; (b) a high-resolution digital terrain model (DTM) and its topographical derivatives; (c) superficial geology and distance to faults and rivers/sea; (d) historical maximum rolling rainfall; and (e) ground vegetation condition. HKUST-LSD v1.0 provides a ready-to-use dataset that includes processed landslide and non-landslide samples, together with reference code that applies representative machine learning techniques to assess landslide susceptibility in Hong Kong with satisfactory performance. The dataset will be updated on a regular basis to fulfil the latest research needs arising in the research community and to support global sustainable development.

Download the dataset at: https://github.com/cehjwang/HKUST-LSD

How to cite: Wang, H., Zhang, L., and Wang, L.: HKUST-Landslide Susceptibility Dataset (HKUST-LSD): A benchmark dataset for landslide susceptibility assessment in Hong Kong, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6259, https://doi.org/10.5194/egusphere-egu23-6259, 2023.

High-magnitude earthquakes in seismic zones often initiate a cascading chain of hazards such as co-seismic landslides, soil liquefaction, snow avalanches, surface faulting, devastating rock avalanches, and ground shaking. In the present study, a co-seismic landslide susceptibility analysis was carried out for the Bhagirathi valley of the Uttarakhand Himalayan region using machine learning techniques based on a slope-unit mapping approach. The study area falls in seismic zone IV; rocks along the fault zone are fragile, and the area is seismically very active. The region previously experienced the 1991 Uttarkashi earthquake of magnitude 6.6. Assessment of seismically induced landslides is a complex process, as it considers both static parameters (causative factors) and a dynamic parameter (the triggering factor) in the form of ground-shaking effects. In this study, co-seismic landslide susceptibility maps were produced at the slope-unit level using the Extreme Gradient Boosting (XGBoost) and Naïve Bayes (NB) machine learning techniques. The landslide inventory, with 3,000 delineated polygons, was split into training (80%) and testing (20%) data to calibrate and validate the models. Static causative factors were considered, such as slope, aspect, curvature, lineament buffer, drainage buffer, geology, topographic wetness index, and normalized difference vegetation index (NDVI); these parameters were generated using CartoDEM and satellite data. As the triggering factor, Arias Intensity (AI) was considered to represent ground shaking for co-seismic landslide susceptibility mapping. Arias Intensity was derived using the classical Cornell approach, considering the earthquake catalogue between the years 1700 and 2022. Finally, the XGBoost and NB techniques were used to compute a static landslide susceptibility map and a dynamic co-seismic landslide susceptibility map for a 475-year return period, with XGBoost predicting better results at the slope-unit level. The results were validated using the seismic relative index (SRI) and the landslide density method. The prepared maps can be effectively helpful for local and regional planning.
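
For illustration only (not the authors' code), a slope-unit classification of this kind can be sketched in Python with the xgboost library; the feature set, hyperparameters, and random placeholder data are assumptions.

import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(7)
X = rng.random((3000, 9))      # slope, aspect, ..., NDVI, Arias Intensity
y = rng.integers(0, 2, 3000)   # landslide presence per slope unit

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.8, random_state=7)
clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))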

Keywords: Co-seismic landslide, Slope Unit, Landslide mapping, Machine learning.

How to cite: Gupta, N., Kanungo, D. P., and Das, J.: Co-seismic landslide susceptibility analysis for the Bhagirathi valley of Uttarakhand Himalayan region using machine learning algorithms based on Slope unit techniques, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6937, https://doi.org/10.5194/egusphere-egu23-6937, 2023.

The aim of this study is to contribute to the introduction of a benchmark dataset for landslide susceptibility. The contribution consists of the application of Generalized Additive Models (GAMs) to the test area proposed by Alvioli et al. (2022), located in Central Italy (Umbria Region, 4,095 km²), and to the Mountain Communities of Mont Cervin and Mont Emilius (670 km²), located in the central part of the Valle d’Aosta Region. In the latter area, previous landslide susceptibility studies were carried out by Camera et al. (2021) and Bajni (2022).

The susceptibility analysis is based on slope units for both areas and uses the open-source dataset available for Italy (https://geomorphology.irpi.cnr.it/tools/slope-units, Alvioli et al., 2020). For Central Italy, the predictors and response variable are those made available by Alvioli et al. (2022). For consistency, the morphometric variables for Valle d’Aosta were calculated from the EU-DEM digital elevation model (Copernicus Land Monitoring Service, 25 m horizontal resolution), while the soil-related variables (namely soil depth, soil bulk density and particle size fractions) were derived from the SoilGrids global dataset (Hengl et al., 2017). In addition, consistently with Alvioli et al. (2022), two presence/absence landslide response variables (‘1’/’0’) were defined. For the first one, ‘presence1’, a slope unit was considered impacted by landslides (‘1’) if at least one event was recorded within its limits. For the second one, ‘presence2’, a slope unit was considered impacted by landslides (‘1’) if two or more landslides occurred within its limits. For Valle d’Aosta, landslide events were accessed through the regional inventory (http://catastodissesti.partout.it/), which is updated continuously by the Regional Civil Protection Department and the Forest Corps through regular surveys or following warnings from citizens.

Two landslide susceptibility maps were calculated for each area (‘presence1’, ‘presence2’). GAMs were applied through the mgcv library of R, with and without the option of variable selection through shrinkage. In addition, the behavior of the predictors was analyzed through the associated Component Smoothing Functions (CSF) to check for physical plausibility. Finally, to evaluate uncertainties, a non-spatial k-fold cross-validation was carried out, and the models were evaluated based on contingency tables, the area under the receiver operating characteristic curve (AUROC) and variable importance (decrease in explained variance).
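
The authors implement the GAMs in R's mgcv; purely as an illustration of the non-spatial k-fold evaluation step, a Python analogue might replace the penalized smooths with an additive B-spline logistic model (a rough approximation on synthetic stand-in data, not the study's code):

```python
# Rough analogue of the non-spatial k-fold evaluation described above: an additive
# B-spline logistic model stands in for mgcv's penalized GAM smooths (approximation only).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(7360, 8))          # stand-in predictors, one row per slope unit
y = (X[:, 0] + np.sin(X[:, 1]) + rng.normal(size=7360) > 0).astype(int)  # stand-in 'presence1'

gam_like = make_pipeline(
    StandardScaler(),
    SplineTransformer(n_knots=6, degree=3),                   # per-predictor spline basis
    LogisticRegression(penalty="l2", C=1.0, max_iter=2000),   # ridge penalty ~ shrinkage
)
auc = cross_val_score(gam_like, X, y, cv=10, scoring="roc_auc")
print(f"10-fold AUROC: {auc.mean():.3f} +/- {auc.std():.3f}")
```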

By applying the same modelling algorithm (GAM) to input datasets derived from the same data sources, the study is expected to verify the consistency of the obtained landslide susceptibility results in terms of both model performance and the main driving processes (predictors).

References

Alvioli et al., 2020. Parameter-free delineation of slope units and terrain subdivision of Italy. Geomorphology 258, 107124. https://doi.org/10.1016/j.geomorph.2020.107124

Alvioli et al., 2022. Call for collaboration: Benchmark datasets for landslide susceptibility zonation. https://doi.org/10.31223/X52S9C

Bajni, 2022. Statistical methods to assess rockfall susceptibility in an Alpine environment: a focus on climatic forcing and geomechanical variables. https://doi.org/10.13130/bajni-greta_phd2022-03-23

Camera et al., 2021. Introducing intense rainfall and snowmelt variables to implement a process-related non-stationary shallow landslide susceptibility analysis. Science of The Total Environment 147360. https://doi.org/10.1016/j.scitotenv.2021.147360

Hengl et al., 2017. SoilGrids250m: Global gridded soil information based on machine learning. PLoS one 12, e0169748. https://doi.org/10.1371/journal.pone.0169748

How to cite: Camera, C. and Bajni, G.: Comparison of the effectiveness of application of GAMs for landslide susceptibility modelling in Apennine and Alpine areas, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7907, https://doi.org/10.5194/egusphere-egu23-7907, 2023.

EGU23-9623 | ECS | Posters on site | GM3.3

Can AI-generated landslide inventories replace humans' cognitive abilities in hazard and risk scenarios? 

Sansar Raj Meena, Mario Floris, and Filippo Catani

Landslide inventories are quintessential for landslide susceptibility mapping, hazard modeling, and risk management. For decades, experts and organizations across the world have preferred manual visual interpretation of satellite and aerial imagery. However, manual inventories suffer from several issues, such as the subjectivity of manually extracting landslide boundaries, the limited sharing of landslide polygons within the geoscientific community, and the time and effort that the generation process demands of expert interpreters. To address these challenges, a large amount of research on semi-automated and automatic mapping of landslide inventories has been conducted in recent years. The automatic generation of landslide inventories using Artificial Intelligence (AI) approaches is still in its early stages, as there is currently no published study that can generate a ground-truth representation of a landslide situation following a landslide-triggering event. For landslide boundary delineation with AI-based models, recent studies report F1-scores in the range of 50–80%. With the exception of studies that evaluate models within the same area used for training, very few claim to have attained F1-scores above 80%, and even fewer at larger scales of investigation. As a result, there is currently a research gap between the generation of AI-based landslide inventories and their applicability to landslide hazard and risk assessments. There is a need for the geoscientific community to check the reliability of AI-generated landslide data in terms of their usage in the succeeding phases of landslide response and mitigation in impacted areas.

How to cite: Meena, S. R., Floris, M., and Catani, F.: Can AI-generated landslide inventories replace humans' cognitive abilities in hazard and risk scenarios?, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9623, https://doi.org/10.5194/egusphere-egu23-9623, 2023.

EGU23-9988 | Posters virtual | GM3.3

Comparing the performance of Machine Learning Methods in landslide susceptibility modelling 

Paraskevas Tsangaratos, Ioanna Ilia, and Aikaterini-Alexandra Chrysafi

Landslide phenomena are considered one of the most significant geohazards, with a great impact on the man-made and natural environment. A search of the scientific literature shows that the most studied topic in landslide assessment is the identification of areas that may potentially exhibit instability, by modelling the influence of landslide-related variables with methods and techniques from the domains of knowledge-driven and data-driven approaches. This is not an easy task, since the complex, and in most cases unknown, processes responsible for the evolution of landslide phenomena, whether triggered by natural or man-made activities, limit model performance. Landslide susceptibility assessment, which models the spatial component of landslide evolution, is the most reliable investigation tool for predicting the spatial dimension of the phenomenon with high accuracy. During the past two decades, artificial intelligence methods, and specifically machine learning algorithms, have dominated landslide susceptibility assessments as the main sophisticated methods of analysis. Fuzzy logic algorithms, decision trees, artificial neural networks, ensemble methods and evolutionary population-based algorithms have been among the most advanced methods that proved reliable and accurate.

In this context, the main objective of the present study was to compare the performance of various machine learning models (MLm) in landslide susceptibility assessment. The methodology followed a five-phase procedure: (i) creating the inventory map; (ii) selecting, classifying, and weighting the landslide-related variables; (iii) performing multicollinearity and importance analyses; (iv) implementing the developed methodology and testing the produced models; and (v) comparing the predictive performance of the various models. The computational process was carried out in R and Python, whereas ArcGIS 10.5 was used for compiling the data and producing the landslide susceptibility maps.

In more detail, Logistic Regression, Support Vector Machines, Random Forest, and Artificial Neural Network models were implemented, and their predictive performance was compared. The efficiency of the MLm was estimated for an area of the northwestern Peloponnese region, Greece, characterized by the presence of numerous landslide phenomena. Twelve landslide-related variables (elevation, slope angle, aspect, plan and profile curvature, topographic wetness index, lithology, silt, sand and clay content, distance to faults, and distance to the river network) and 128 landslide locations were used to produce the training and test datasets. The Certainty Factor was implemented to calculate the correlation among the landslide-related variables and to assign a weight value to each variable class. Multicollinearity analysis was used to test for collinearity among the landslide-related variables. Learning Vector Quantization (LVQ) was used for ranking features by importance, whereas the evaluation process involved estimating the predictive ability of the MLm via classification accuracy, sensitivity, specificity and the area under the success and predictive rate curves (AUC). Overall, the outcome of the study indicates that all MLm provided highly accurate results, with the Artificial Neural Network approach being the most accurate, followed by Random Forest, Support Vector Machines and Logistic Regression.
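
As a schematic illustration of such a comparison (not the authors' code; the Peloponnese dataset is replaced by synthetic stand-in data), the four classifier families can be benchmarked with cross-validated AUC as follows:

```python
# Schematic comparison of the four classifier families named above on synthetic
# stand-in data (the study's actual dataset is not reproduced here).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=256, n_features=12, n_informative=8, random_state=7)
models = {
    "LogisticRegression": LogisticRegression(max_iter=2000),
    "SVM": SVC(probability=True),
    "RandomForest": RandomForestClassifier(n_estimators=300, random_state=7),
    "ANN": MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=3000, random_state=7),
}
for name, clf in models.items():
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC = {auc.mean():.3f}")
```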

How to cite: Tsangaratos, P., Ilia, I., and Chrysafi, A.-A.: Comparing the performance of Machine Learning Methods in landslide susceptibility modelling, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9988, https://doi.org/10.5194/egusphere-egu23-9988, 2023.

Numerous advanced techniques, including machine learning models, are widely used in landslide susceptibility zoning and often result in very high accuracy. In some cases, very high accuracy reflects overfitting, where a model adapts very well to the training data but performs poorly on test or new data. Cross-validation (CV) strategies are often employed to reduce overfitting in machine learning models, and several CV techniques have recently been developed as part of the machine learning workflow. However, the grounds for preferring one CV method over another are still unclear in landslide susceptibility zoning. To illustrate this issue, the authors compare no CV, standard V-fold CV, and several spatial CV techniques on a benchmark dataset in Italy, training, validating and testing an XGBoost model with 26 landslide controlling factors. The variation of validation ROC, testing ROC, and confusion matrices was used to detect model overfitting. The preference for a particular CV technique on the Italian benchmark data will be discussed further. The results are expected to provide guidance for choosing a CV technique in landslide susceptibility zoning based on slope units and a machine learning workflow.
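
As a crude illustration of why the choice matters, the sketch below contrasts standard k-fold CV with a spatially blocked CV for an XGBoost model on synthetic, spatially structured data; coordinate-binned groups serve here as a simple proxy for spatial CV, not necessarily the authors' exact schemes:

```python
# Crude illustration of standard vs. spatially blocked CV for an XGBoost model.
# Spatial blocks are formed by binning slope-unit coordinates (a simple proxy).
import numpy as np
from sklearn.model_selection import KFold, GroupKFold, cross_val_score
from xgboost import XGBClassifier

rng = np.random.default_rng(3)
n = 2000
xy = rng.uniform(0, 100, size=(n, 2))                      # slope-unit centroids (synthetic)
X = np.c_[xy, rng.normal(size=(n, 24))]                    # 26 controlling factors (synthetic)
y = (xy[:, 0] + rng.normal(scale=20, size=n) > 50).astype(int)  # spatially structured labels
blocks = (xy[:, 0] // 20).astype(int)                      # five spatial strips as CV groups

clf = XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
auc_std = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=3), scoring="roc_auc")
auc_spat = cross_val_score(clf, X, y, cv=GroupKFold(5), groups=blocks, scoring="roc_auc")
print("standard 5-fold AUROC:   ", auc_std.mean().round(3))
print("spatially blocked AUROC: ", auc_spat.mean().round(3))  # typically lower = less optimistic
```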

How to cite: Samodra, G., Wahyudi, E. E., and Susyanto, N.: Cross validation technique preference for landslide susceptibility zoning based on slope unit and machine learning workflow, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11051, https://doi.org/10.5194/egusphere-egu23-11051, 2023.

Bayesian logistic regression with vague priors and optimized XGBoost models are two contrasting and commonly used approaches for modeling landslide susceptibility. Logistic regression calculates the log odds of a binary outcome (i.e., landslide or no landslide) given some predictor data (e.g., slope, elevation, and geology) that describes the terrain of each mapping unit used to divide the terrain for susceptibility evaluation. The Bayesian implementation incorporates uncertainty into the model by using probability distributions of the model parameters. Weakly informative priors ensure that the likelihood function (i.e., the observational data) dominates the posterior distributions, which can be estimated using the statistical software Stan. Like logistic regression, the gradient-boosted decision tree machine learning algorithm XGBoost requires the predictor data of each mapping unit to output a probability of an event. Decision trees are a non-parametric learning tool that uses a set of if-then-else decision rules to predict the expected model outcome. Gradient boosting sequentially adds decision trees to improve the model output until the lowest model residual levels are reached, while penalizing for the level of complexity added to the model. We optimize the model parameters using a Bayesian cross-validation procedure on a portion of the training data. To obtain distributions of the level of susceptibility from XGBoost, a 10-fold cross-validation procedure with ten iterations is implemented. Both the Bayesian logistic regression and XGBoost models are evaluated using the area under the receiver operating characteristic curve and the Brier score, although any other common evaluation metric is possible. Model development and evaluation are carried out in the computational environment R. These methods have been applied with success to many diverse regions of the United States and would benefit from testing with the benchmark datasets proposed by the conveners.
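
The authors' implementation is in R (with Stan for the Bayesian model); as a minimal language-agnostic illustration, the two evaluation metrics named above can be computed on predicted probabilities as follows (synthetic placeholder data):

```python
# Minimal sketch of the two evaluation metrics named above (AUROC and Brier score),
# applied to predicted landslide probabilities; the data are synthetic placeholders.
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)                                # observed outcomes
y_prob = np.clip(y_true * 0.6 + rng.uniform(size=500) * 0.5, 0, 1)   # model probabilities

print("AUROC:", round(roc_auc_score(y_true, y_prob), 3))     # discrimination
print("Brier:", round(brier_score_loss(y_true, y_prob), 3))  # calibration + sharpness
```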

How to cite: Mirus, B. and Woodard, J.: Bayesian logistic regression and optimized XGBoost models for landslide susceptibility assessment, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11586, https://doi.org/10.5194/egusphere-egu23-11586, 2023.

Grid cells (GC) and slope units (SU) are the most common mapping units in landslide susceptibility modeling. SU-based models have recently gained popularity in the field because of the availability of user-friendly software and certain advantages over GC approaches. For example, SUs are often described as more geomorphologically meaningful, less sensitive to positionally inaccurate landslide data and more flexible in representing specific variables (e.g., binary vs. count responses). In contrast to GCs, SU sizes can vary considerably within a study area. Spatially varying mapping-unit sizes may be accompanied by a spatially varying likelihood of a SU being affected by a landslide. We assume that larger SUs are more likely to be labeled as "landslide-affected" than equally susceptible smaller SUs, simply because of their larger spatial extent. In other words, the larger the area investigated, the more likely it is that a landslide will be found within it. This may have relevant effects on subsequent landslide susceptibility models, especially if certain predictor variables correlate with SU size.

To our knowledge, the effects of different SU sizes on landslide susceptibility models have rarely been investigated, and no approaches to explicitly consider SU size have yet been presented. In this contribution, we use Generalized Additive Mixed Models (GAMM) to compare four different strategies for dealing with spatially varying SU sizes in landslide susceptibility modeling. The analyses focus on the provided SU-based dataset related to a part of the Umbria region in Central Italy (~4,100 km²). In the first strategy, all predisposing factors, including those directly related to SU size (i.e., SU area and distance/SU area), are used for model fitting and spatial prediction. The second strategy builds upon strategy 1, but does not consider the size of the SUs for model fitting and spatial prediction. The third strategy demonstrates the ability of SU size to discriminate SUs with landslides from those without, and consists of a single-variable model with the area of the SUs as its only predictor. Finally, in the fourth strategy, all predictors are used for model fitting, but the effect of SU size is averaged out of the spatial prediction (i.e., the size effect is not predicted into space, but its potentially confounding effect is isolated during model fitting).
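
As a minimal illustration of strategy 3, the discriminatory power of SU size alone can be demonstrated with a single-predictor logistic model; the sketch below uses synthetic data in which the labeling probability grows with SU area (an assumption for illustration, not the study's data):

```python
# Minimal sketch of strategy 3: a single-predictor model using only slope-unit area
# to discriminate landslide-affected units. Synthetic data for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
area = rng.lognormal(mean=0.0, sigma=1.0, size=4100)   # slope-unit areas (synthetic)
p = 1 / (1 + np.exp(-(np.log(area) - 0.5)))            # larger SU -> higher label probability
y = rng.binomial(1, p)

log_area = np.log(area).reshape(-1, 1)
m = LogisticRegression().fit(log_area, y)
auc = roc_auc_score(y, m.predict_proba(log_area)[:, 1])
print(f"AUROC of the area-only model: {auc:.3f}")      # >0.5 reveals the size effect
```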

The first tests support the assumption that larger SUs are more likely to be labeled as landslide-affected and that the associated confounding effects should be considered in landslide susceptibility modeling. We present the four strategies in terms of modeled relationships, relative variable importance, spatial prediction patterns and quantitative validation results.

How to cite: Moreno, M. and Steger, S.: Slope unit size matters - why should the areal extent of slope units be considered in data-driven landslide susceptibility models?, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12943, https://doi.org/10.5194/egusphere-egu23-12943, 2023.

EGU23-13362 | Orals | GM3.3

Exploring the benchmark dataset for tasks related to landslide susceptibility assessment 

Jewgenij Torizin and Nick Schüßler

In the presented study, we investigate the possibilities of performing tasks related to landslide susceptibility assessment (LSA) on the provided benchmark dataset. The slope unit-based dataset consists of aggregated predisposing factors and two label sets. Although initially introduced as a dataset for binary classification tasks, it is also suitable for zoning and regression analysis in combination with the underlying landslide inventory. Zoning ranks slope units to delineate the study area into susceptibility zones. In the regression analysis, we try to predict a numeric target value (e.g., landslide count) from the slope unit's attributes.

We explored the benchmark dataset using bivariate and multivariate statistical visualization techniques to better understand the data relations. We found the dataset, at this stage, insufficient for achieving a well-explainable, high-performance classification using linear models. Most attributes are not specific enough to linearly separate the given labels, and the chosen central tendency statistics (mean and standard deviation) may not sufficiently characterize the parameter distributions inside the slope units.

We propose a theoretical concept for zonation analysis to assess the best possible performance on the given discrete dataset, using the success-rate curve as the model evaluation metric. Because no applied algorithm can modify the geometry of the discrete slope units, the evaluation metric depends only on the relative ranking of the slope units, and the best possible performance is obtainable without computing a predictive model. For frequency-related models (weighting of factors with landslide count statistics), directly computing conditional probabilities or the frequency ratio on the slope units as a ranking factor provides the best possible ranking; for binary labels, combining the label with the slope unit's area does.
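
As an illustration of this ranking argument, the sketch below (synthetic data, not the benchmark itself) ranks slope units by landslide density and integrates the resulting success-rate curve:

```python
# Compact sketch of the ranking idea described above: rank slope units by landslide
# density (count per unit area) and trace the resulting success-rate curve.
# All data are synthetic stand-ins for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
area = rng.lognormal(0.0, 0.8, size=n)                           # slope-unit areas
counts = rng.poisson(0.5 * area * rng.gamma(1.0, 1.0, size=n))   # landslides per unit

score = counts / area                                  # density-based ranking factor
order = np.argsort(score)[::-1]                        # best-ranked units first
cum_area = np.cumsum(area[order]) / area.sum()         # x: cumulative area fraction
cum_ls = np.cumsum(counts[order]) / counts.sum()       # y: landslide fraction captured
auc_src = np.sum(np.diff(cum_area) * (cum_ls[1:] + cum_ls[:-1]) / 2)  # trapezoid rule
print(f"Area under the success-rate curve: {auc_src:.3f}")
```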

We conducted a regression and classification analysis with artificial neural networks (ANN), testing different combinations of parameters (sensitivity analysis) and network architectures that allow for modeling nonlinear relations. In both analyses, initial results show that a complex network architecture can boost the model fit on the training dataset while losing predictive performance on the test data. The dataset pre-exploration also corresponds well with the ANN sensitivity analysis: the number of parameters can be reduced to a few effective predictors without losing much classification accuracy, which remains poor to moderate depending on the label set used.

While slope units as an aggregation for geomorphological analyses remain undisputed, the proposed aggregation of predisposing factors in slope units at the analysis's entry point needs further discussion. Aggregating the results of a raster-based LSA to overcome deviances in landslide susceptibility patterns caused by data uncertainties or different methods could be more suitable at this point. Slope units should be analyzed with regression analysis in LSA to consider their different spatial extents during the calculation.

We provide our scripts, visualizations, and results as a Jupyter Notebook on our GitHub: https://github.com/BGR-EGHA/EGU23_GM3.3_ls_benchmark.

How to cite: Torizin, J. and Schüßler, N.: Exploring the benchmark dataset for tasks related to landslide susceptibility assessment, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13362, https://doi.org/10.5194/egusphere-egu23-13362, 2023.

EGU23-16251 | Orals | GM3.3

Ensemble learning on the benchmark dataset for landslide susceptibility zonation in Central Italy 

Héctor Aguilera, Jhonatan Steven Rivera Rivera, Carolina Guardiola-Albert, and Marta Béjar-Pizarro

In response to the call for collaboration, we aim to develop landslide susceptibility maps for the benchmark study area using Ensemble Machine Learning. Ensemble Learning has proven successful for landslide susceptibility mapping in highly susceptible Asian regions of South Korea (Kaavi et al., 2018) and China (Hu et al., 2020).

The benchmark dataset provided, encompassing 7360 slope units in the central region of Italy, has 26 morphometric and thematic attributes and two binary targets indicating the presence (1) or absence (0) of landslides. The first binary target is balanced with respect to the number of zeros and ones (target 1), and the second in terms of the area covered by slope units labeled with zero or one (target 2). For each of the two targets, we will compare the performance of individual classifiers such as logistic regression, naive Bayes, decision trees, k-nearest neighbors, support vector machines and neural networks, as well as bagging (e.g., random forest) and boosting (e.g., extreme gradient boosting, CatBoost) algorithms, using cross-validation. The best and most diverse models will then be selected based on typical performance metrics such as AUC and the Matthews Correlation Coefficient (MCC), fine-tuned, and combined using stacking and blending Ensemble Learning techniques.
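
A minimal sketch of the planned stacking step (scikit-learn's StackingClassifier on synthetic stand-in data; the actual model pool, tuning and metrics will differ) could read:

```python
# Hedged sketch of the planned workflow: cross-validated base learners combined by
# stacking. The data are a synthetic stand-in for the 7360-unit benchmark table.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=7360, n_features=26, n_informative=12, random_state=0)
base = [("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("nb", GaussianNB()),
        ("knn", KNeighborsClassifier())]
stack = StackingClassifier(estimators=base,
                           final_estimator=LogisticRegression(max_iter=1000), cv=5)
print("stacked AUROC:", cross_val_score(stack, X, y, cv=5, scoring="roc_auc").mean().round(3))
```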

The best model will be re-trained with different configurations of training and test sets to derive an error distribution and thus add a measure of uncertainty to each slope unit of the landslide susceptibility maps. Further, we will develop a landslide susceptibility index based on the results (e.g., probability distributions of the outcomes) to produce quantile-based susceptibility maps.

This work has been developed thanks to the pre-doctoral grant for the Training of Research Personnel (PRE2021-100044) funded by MCIN/AEI/10.13039/501100011033 and by "FSE invests in your future" within the framework of the SARAI project "Towards a smart exploitation of land displacement data for the prevention and mitigation of geological-geotechnical risks" PID2020-116540RB-C22 funded by MCIN/AEI/10.13039/501100011033.

How to cite: Aguilera, H., Rivera Rivera, J. S., Guardiola-Albert, C., and Béjar-Pizarro, M.: Ensemble learning on the benchmark dataset for landslide susceptibility zonation in Central Italy, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16251, https://doi.org/10.5194/egusphere-egu23-16251, 2023.

EGU23-1059 | ECS | Orals | EMRP1.6

Seismic reflectivity of fractures: the impact of secondary connected fractures 

Edith Sotelo, J. German Rubino, Nicolas D. Barbosa, Santiago G. Solazzi, and Klaus Holliger

Fractures are ubiquitous throughout the Earth's upper crust and dominate the mechanical and hydraulic properties of the affected rock masses. Indeed, open fractures act as fluid conduits and, commonly, flow is controlled by larger fractures, which are, in turn, likely to be connected to smaller ones. Therefore, fracture characterization is of paramount importance for many pertinent applications, such as geothermal energy production, CO2 sequestration, nuclear waste storage, and hydrocarbon exploration. Seismic reflection methods are useful tools for fracture characterization due to the generally high reflectivity that large fractures exhibit as a consequence of their strong mechanical contrast with the embedding intact background. The magnitude of this mechanical contrast is known to be strongly affected by fracture-to-background wave-induced fluid pressure diffusion (FPD). Conversely, the FPD effects associated with secondary connected fractures remain so far unexplored. We investigate the influence of FPD on the normal compliance and on the vertical-incidence PP reflectivity of a large fracture that is hydraulically connected to smaller fractures. To this end, we use several models consisting of an infinite horizontal main fracture connected to multiple vertical secondary fractures of finite length, embedded in an impermeable background. The individual models differ only with regard to the geometrical (e.g., length and aperture) and physical (e.g., permeability and bulk modulus) properties of the secondary fractures. For comparison, we also calculate the normal compliance and the reflectivity of an isolated infinite horizontal fracture. To assess the changes of fracture compliance due to FPD, we perform a vertical compressional oscillatory test on samples of the aforementioned models that include part of the fracture system and the embedding background. This test simulates the FPD effects that a vertically propagating P-wave generates between the main and secondary fractures. Specifically, the wave produces a pressure increase in the horizontal fracture that equilibrates as fluid flows into the secondary vertical fractures. Based on this oscillatory test, we compute the averages of the vertical components of strain and stress over the main fracture, which we use to estimate its normal compliance. We then calculate the PP reflectivity at normal incidence using the inferred P-wave modulus. Our results show that both the compliance and the PP reflectivity of the main fracture increase by as much as two orders of magnitude in response to the presence of secondary fractures. We also find that the physical and geometrical properties of the secondary connected fractures influence the normal compliance and reflectivity of the main fracture.
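
As a hedged illustration of the final step only, the normal-incidence PP reflection coefficient follows from the impedance contrast between the background and the effective fracture-zone properties; all numbers below are invented for demonstration and are not the study's results:

```python
# Illustration of the final step: normal-incidence PP reflection coefficient from an
# inferred P-wave modulus, via the standard impedance-contrast formula. All values
# are invented placeholders, not the study's models or results.
import numpy as np

rho_bg, M_bg = 2650.0, 60e9        # background density (kg/m^3) and P-wave modulus (Pa)
rho_fr, M_fr = 2400.0, 15e9        # effective properties of the compliant fracture zone

Z_bg = np.sqrt(rho_bg * M_bg)      # P-wave impedance Z = rho * Vp = sqrt(rho * M)
Z_fr = np.sqrt(rho_fr * M_fr)
R = (Z_fr - Z_bg) / (Z_fr + Z_bg)  # normal-incidence reflection coefficient
print(f"Vp background = {np.sqrt(M_bg / rho_bg):.0f} m/s, R = {R:.3f}")
```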

How to cite: Sotelo, E., Rubino, J. G., Barbosa, N. D., Solazzi, S. G., and Holliger, K.: Seismic reflectivity of fractures: the impact of secondary connected fractures, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1059, https://doi.org/10.5194/egusphere-egu23-1059, 2023.

The bauxite reservoir of the new-type Taiyuan Formation in the Zhengning area, southwest Ordos Basin, is affected by the karst palaeogeomorphology, and its thickness varies greatly. In order to systematically study bauxite as a new type of reservoir, the petrological characteristics and pore structure characteristics of the bauxite reservoir were studied based on core observation, microscopic thin sections, high-pressure mercury injection, low-temperature nitrogen adsorption and other experimental methods, further verifying the reservoir's exploration significance. The results show that: (1) the upper and lower parts of the reservoir are bauxitic mudstones, and the middle part is argillaceous bauxite; the relatively well-developed dissolution pores are the main storage space of the bauxite; (2) the bauxite minerals of the Taiyuan Formation consist mainly of aluminum minerals and clay minerals, chiefly diaspore, kaolinite, illite and chlorite; (3) the bauxite reservoir space is mainly composed of intragranular dissolution pores, matrix dissolution pores, intergranular dissolution pores, intergranular pores and microcracks, with pore sizes mainly between 20 and 200 μm and main throat diameters of 150 nm to 4 μm. The pore structure is good, the mercury removal efficiency is high, and the pore throats are overall submicron to micron in scale. The average porosity of the reservoir is 10.6% and the average permeability is 4.04×10⁻³ μm², with permeability values greater than 0.3×10⁻³ μm² accounting for 36% of the samples, indicating good reservoir conditions. The research results provide a basis for bauxite gas exploration in the Ordos Basin.

How to cite: Li, H. Y.: Bauxite Reservoir Characteristics of Taiyuan Formation in Zhengning Area of Southwest Ordos Basin, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1115, https://doi.org/10.5194/egusphere-egu23-1115, 2023.

Capillary force shows great potential to improve the recovery of shale oil and gas reservoirs through spontaneous imbibition. However, the mechanism by which capillary force drives shale oil migration, and its controlling factors, are still unclear. Using NMR, low-temperature nitrogen adsorption, high-pressure mercury injection and other experimental means, this work investigates the role of capillary force in improving shale oil recovery. The results show that the nuclear magnetic resonance T2 spectra obtained through spontaneous imbibition can be divided into three types, and that shale oil recovery can reach 38.72%–65.52%, contributed mainly by the first peak (P1). Water imbibition and oil imbibition experiments were carried out on samples of the same size, and the dynamic wettability index of the samples was calculated as a function of spontaneous imbibition time. Type 1 shale was found to be mainly lipophilic, while type 2 and type 3 samples are mainly hydrophilic; the P1 pores of all three shale types are hydrophilic to neutral, and the water imbibition volume of the three samples was greater than the oil imbibition volume. In addition, by comparing the relationship between pore throats and pores and combining the structural characteristics of the samples, three typical pore-throat types are summarized. Finally, a comprehensive study of wettability, shale pore structure and shale oil recovery shows that water can drive oil droplets in micropores or pore throats (P1) into the mesopores (P2), which then transmit the oil to the fractures by transferring the pressure difference. The oil-water distribution pattern before and after spontaneous imbibition under capillary force is summarized to provide a theoretical basis for shale oil exploration and development.

How to cite: Yin, N.: Shale oil mobility and pore size-associated wettability under capillary pressures, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1152, https://doi.org/10.5194/egusphere-egu23-1152, 2023.

EGU23-1838 | ECS | Orals | EMRP1.6

Organic matter matters - The imaginary conductivity of sediments rich in solid organic carbon 

Cora Strobel, Manuel Dörrich, Olaf A. Cirpka, Johan A. Huisman, and Adrian Mellage

Solid organic matter (SOM) is an important component of natural sediments and plays a crucial role in providing substrate for microbial reactions and the degradation of contaminants in soil and groundwater. Knowledge about its distribution in the subsurface is crucial for the delineation of potential hotspots of microbial activity. The subsurface is, however, difficult to access, limiting our ability to reliably delineate the spatially heterogeneous distribution of SOM. Recently, the geophysical method induced polarization (IP) has been shown to be a potentially promising mapping tool, able to detect the presence of SOM. However, the mechanisms controlling IP signals in the presence of SOM are not (yet) well understood, with a handful of studies highlighting inconclusive results (Katona et al., 2021; Mellage et al., 2022; Ponziani et al., 2012; Schwartz & Furman, 2014). Moreover, a non-negligible contribution of polarization from the organic matrix can yield signals that may cause misinterpretation of other petro-physical relationships in unconsolidated sediments.

In this study, we measured the spectral IP (SIP) response of aquifer sediment cores (2 – 8 m depth) collected from an alluvial floodplain aquifer in southwest Germany. The total organic carbon (TOC) content in the cores and the cation exchange capacity (CEC) exhibit a positive correlation with the magnitude of polarization (i.e. imaginary conductivity). In addition, strong differences in the frequency dependence of the IP measurements as a function of TOC fraction were observed for the otherwise calcareous matrix devoid of other strongly polarizing mineral phases (e.g. pyrite or clay minerals). While the CEC at the site is strongly dominated by the amount of SOM, polarization is more strongly linked to SOM than CEC. We hypothesize that the weaker correlation between SOM and CEC highlights the contribution of poorly understood charge storage mechanisms within the polydisperse organic matrix that differ from polarization at mineral surfaces. Ongoing experiments with artificial soil mixtures of calcitic sand and varying fractions of peat, under controlled conditions (i.e. constant electrical conductivity of the pore fluid), will help to shed light on the controls behind our field-derived relationships. We expect that our combined field and laboratory investigations will provide insights into the petro-, or rather, organo-physical relationship between SOM and the imaginary conductivity, and thus contribute to a conceptualization of the underlying polarization mechanisms in organic matrices.

References

Katona, T., Gilfedder, B. S., Frei, S., Bücker, M., & Flores Orozco, A. (2021). High-resolution induced polarization imaging of biogeochemical carbon-turnover hot spots in a peatland. Biogeosciences, 18(13), 4039–4058.

Mellage, A., Zakai, G., Efrati, B., Pagel, H., & Schwartz, N. (2022). Paraquat sorption- and organic matter-induced modifications of soil spectral induced polarization (SIP) signals. Geophysical Journal International, 229(2), 1422–1433. https://doi.org/10.1093/gji/ggab531

Ponziani, M., Slob, E. C., Vanhala, H., & Ngan-Tillard, D. (2012). Influence of physical and chemical properties on the low-frequency complex conductivity of peat. Near Surface Geophysics, 10(6), 491–501. https://doi.org/10.3997/1873-0604.2011037

Schwartz, N., & Furman, A. (2014). On the spectral induced polarization signature of soil organic matter. Geophysical Journal International, 200(1), 589–595. https://doi.org/10.1093/gji/ggu410

How to cite: Strobel, C., Dörrich, M., Cirpka, O. A., Huisman, J. A., and Mellage, A.: Organic matter matters - The imaginary conductivity of sediments rich in solid organic carbon, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1838, https://doi.org/10.5194/egusphere-egu23-1838, 2023.

Access to top research equipment facilitates top research. However, the research equipment needed may not always be available within individual institutes, while access to external facilities may not always be affordable. This restricts the research that any individual can do and hampers scientific breakthroughs, particularly across disciplines. To overcome this limitation, a collaborative infrastructure network was initiated: EPOS-NL (European Plate Observing System-Netherlands). EPOS-NL provides free-of-charge access to geophysical labs at Utrecht University and Delft University of Technology, both in the Netherlands, for research within rock physics, analogue modelling of tectonic processes, X-ray tomography and microscopy. These labs include capabilities for, among others: A) mechanical and transport testing at crustal stress, temperature and chemistry conditions; B) analogue tectonic modelling, including dynamic model imaging in 2D and 3D; C) X-ray tomography at sub-µm resolution; and D) a correlative workflow for imaging and microchemical mapping, down to nm resolution. As such, these labs can provide you with the means and expertise for your research into the physical behavior of the Earth’s crust and upper mantle.

Access to EPOS-NL can be requested by applying to a bi-annual call, posted on www.EPOS-NL.nl. This involves submitting a short (1-2 page) research proposal. Research proposals are reviewed on the basis of feasibility and excellence, but generally have a high chance of success (~80% in previous rounds). Interested? Have a look on the EPOS-NL website – and apply!

How to cite: Wessels, R. and Pijnenburg, R.: Access for free: How to get free-of-charge access to Dutch Earth scientific research labs through EPOS-NL, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2834, https://doi.org/10.5194/egusphere-egu23-2834, 2023.

EGU23-3097 | ECS | Orals | EMRP1.6

Petroacoustic characterization of fractured and weathered limestone from the O-ZNS Critical Zone Observatory 

Abdoul Nasser Yacouba, Céline Mallet, Jacques Deparis, Philippe Leroy, Gautier Laurent, Mohamed Azaoural, and Damien Jougnot

In the context of the energy transition and the water resources crisis, studying fluid flow in the critical zone appears to be a major issue. The O-ZNS site (Observatory of transfers in the Vadose Zone, Orleans, France) has been designed for the development of innovative tools to characterize and monitor the dynamics of the vadose zone (VZ). The geological structure of this VZ is composed mainly of a lacustrine limestone formation located between 10 and 20 m depth, characterized by multiscale heterogeneities (facies variations; presence of cracks, fractures, pores, cavities and karstification). In order to predict fluid flow, heat transfer, and aquifer recharge through this VZ, the limestone heterogeneities have to be integrated into geological concepts and numerical models.

This study is a key part of the O-ZNS project, as it aims at (i) understanding and classifying the microstructural and petrophysical properties at laboratory scale; (ii) predicting these properties through quantitative geophysical parameters and; (iii) developing new geophysical interpretations through coupled approaches.

Based on the well-log analysis of the O-ZNS site, we collected limestone samples from four main facies (four samples per facies). We performed a state-of-the-art petrophysical characterization including connected and total porosity, density, and permeability measurements. We then carried out acoustic measurements on dry and water-saturated plugs (2.5 and 4 cm diameter) with P- and S-waves at two frequencies, 0.5 and 1 MHz.

The measurement results show a large dispersion of the petrophysical properties. For example, connected porosity ranges from 4 to 12%, and density from 2.3 to 2.5 g/cm³. This dispersion of petrophysical properties is interpreted in terms of heterogeneity in the type of porosity (micrometre- to centimetre-scale pore sizes, presence of cracks and fractures) and in mineralogy. However, it appears that the deepest facies (located at the aquifer level) is more homogeneous and shows the highest porosity. This is consistent with the directly observed (micro)structure from 3D sample and well scans.

Acoustic velocity results show values consistent with fractured limestone rocks. The different facies show dispersion, with Vp varying from 4950 to 5600 m/s for the shallowest facies at 9 m depth. Here also, the deepest facies appears to be the most homogeneous, with the lowest velocities (around 4875 m/s). The velocities are thus consistent with the petrophysical measurements, and one can draw a simple relationship between porosity, density and acoustic velocities. However, other petroacoustic relationships are necessary to better discriminate between the facies and therefore predict their microstructure and transport properties.
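
One classical porosity-velocity relationship that could serve as a first-order check here is Wyllie's time-average equation; the sketch below uses assumed textbook end-member velocities for calcite and water, not the authors' calibration:

```python
# Illustration of a first-order porosity-velocity check using Wyllie's time-average
# equation, 1/V = phi/V_fluid + (1 - phi)/V_matrix. End-member velocities are
# assumed textbook values, not the authors' calibrated parameters.
import numpy as np

v_matrix = 6500.0                     # P-wave velocity of a calcite matrix (m/s), assumed
v_fluid = 1500.0                      # P-wave velocity of water (m/s)
phi = np.array([0.04, 0.08, 0.12])    # connected porosities within the reported range

v_p = 1.0 / (phi / v_fluid + (1.0 - phi) / v_matrix)
for f, v in zip(phi, v_p):
    print(f"phi = {f:.2f} -> Vp ~ {v:.0f} m/s")
```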

The next step of this work is to add electrical measurements, develop petro-acoustic-electrical models, and enhance our capacity to upscale these properties from the laboratory to the field.

How to cite: Yacouba, A. N., Mallet, C., Deparis, J., Leroy, P., Laurent, G., Azaoural, M., and Jougnot, D.: Petroacoustic characterization of fractured and weathered limestone from the O-ZNS Critical Zone Observatory, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3097, https://doi.org/10.5194/egusphere-egu23-3097, 2023.

EGU23-3347 | Posters on site | EMRP1.6

Estimation of normal and shear compliance for inclined fractures from full-waveform sonic log data 

Zhenya Zhou, Eva Caspari, Nicolás D Barbosa, Marco Favino, and Klaus Holliger

Fractures are ubiquitous throughout the Earth’s upper crust and represent localized zones of mechanical weakness as well as preferential pathways for fluid flow. Correspondingly, their detection and characterization are vital for a wide range of pertinent applications in geological, civil, and environmental engineering, hydrocarbon exploration, nuclear waste and carbon dioxide storage, as well as geothermal energy production. Particularly important mechanical characteristics of fractures are their normal and shear compliances, which relate the displacement perpendicular and parallel to the fracture plane, respectively, to the corresponding components of the prevailing stress tensor. Based on the linear slip model, previous works developed a phase delay method to estimate the normal compliance of individual fractures using the P-wave first arrivals in full-waveform sonic (FWS) log data. This approach is viable for a quasi-normal incidence scenario of the sonic wavefield. However, the conditions under which this technique remains valid at oblique P- and S-wave incidence angles, as well as the role played by the combined effects of the normal and shear compliances, remain enigmatic. To alleviate this problem, we have extended the phase delay technique to allow for non-normally incident P- and S-waves. In addition to improving the accuracy of the normal compliance estimates with respect to results computed under a normal incidence assumption, this method allows for a simultaneous estimation of the normal and shear compliances. The proposed approach has been validated through analytical tests and numerical simulations of wave propagation in a hard-rock-type borehole environment intersected by a single fracture with dip angles of 0, 30, and 40 degrees with regard to the horizontal. For fracture compliance values typical of mesoscale fractures (10⁻¹⁴ to 10⁻¹² m/Pa), the effects associated with oblique incidence become significant for dip angles larger than 50 and 30 degrees for P- and S-waves, respectively. However, our results also demonstrate that the normal incidence assumption can produce similar errors at even lower fracture dip angles in the presence of larger fracture compliance values and/or shear-to-normal compliance ratios. Finally, we apply the method to observed FWS data acquired in granitic rocks where the considered boreholes intersect fractures at a range of oblique angles. Direct in-situ estimates of compliances for discrete individual fractures are scarce, but essential to bridge the scale gap between laboratory estimates and input data for reservoir-scale models. While recent studies show the feasibility of estimating normal compliances from FWS data, this study aims to explore whether and to what extent this approach can be practically extended to shear compliances and to the corresponding shear-to-normal compliance ratios.

How to cite: Zhou, Z., Caspari, E., Barbosa, N. D., Favino, M., and Holliger, K.: Estimation of normal and shear compliance for inclined fractures from full-waveform sonic log data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3347, https://doi.org/10.5194/egusphere-egu23-3347, 2023.

EGU23-3376 | Orals | EMRP1.6

The heterogeneous near-surface velocity structure of carbonate-hosted seismogenic fault zones investigated at different length scales: from ultrasonic measurements to subsurface seismic tomography 

Michele Fondriest, Maurizio Vassallo, Stéphane Garambois, Thomas M. Mitchell, Di Giulio Giuseppe, Mai-Linh Doan, and Christophe Voisin

Field geological studies have revealed the heterogeneous structure of fault zones down to the sub-metric scale due to the juxtaposition of rocks presenting distinct deformation intensity and physical-transport properties. However, such internal variability is not generally resolved by most seismic tomography techniques due to spatial resolution limits. Quantifying the heterogeneous internal structure of fault zones is fundamental to understand their mechanical and hydrological characteristics. In this sense, determining seismic wave velocities and related physical properties (elastic moduli, porosity and fracture intensity) within fault zones, at different observational scales, is crucial.

Here, the near-surface velocity structure of two active seismogenic fault zones located in the Central Apennines of Italy was quantified at different length scales, from laboratory measurements of ultrasonic velocities (rock samples of few centimeters, 1 MHz source) to high-resolution first-arrival seismic tomography (spatial resolution of few meters). Detailed structural mapping was conducted within the Vado di Corno and Monte Marine fault zones, two NW-SE trending structures with length of ~ 15 km and up to 1.5 km of extensional displacement. Distinct structural units separated by fault strands were recognized in the fault zone footwall blocks cutting Mesozoic dolomitic carbonates: (i) fault core cataclastic units, (ii) breccia unit, (iii) high-strain damage zone, (iv) low-strain damage zone. The single units were systematically sampled along transects orthogonal to the average strike of the faults and characterized in the laboratory in terms of directional P and S ultrasonic wave velocities, porosity and microstructures. The fault core cataclastic units were significantly “slower” (VP = 4.5±0.4 kms-1, VS = 2.7±0.2 kms-1) compared to the damage zone units (VP = 5.6±0.6 kms-1, VS = 3.2±0.3 kms-1) at short length scales (i.e. few centimeters). A general negative correlation between ultrasonic velocity and porosity was observed, with some variability within the fault core mostly related to the textural maturity (clast/matrix volume ratio) of the fault rocks and the degree of pore space sealing by calcite cements.

Multiple P- and S-wave high-resolution seismic profiles (length 90-116 m, geophone spacing 1-1.5 m) were acquired across the two fault zones at different structural sites, moving from the principal fault surface into the outer damage zone. The derived first-arrival tomography models highlighted fault-bounded rock bodies with distinct velocities, characterized by geometries that compared well with those deduced from the structural mapping. At the larger length scale investigated by the active seismic survey, relatively “fast” fault core units (VP ≤ 3.0 kms-1, VS ≤ 1.8 kms-1) and very “slow” high-strain damage zones (VP < 1.6 kms-1, VS < 1 kms-1) were recognized. These velocity ranges were significantly different from those determined in the laboratory on small samples. This apparent discrepancy could be reconciled using an effective medium approach, considering the effect of the mesoscale fracture density and size distributions affecting each structural unit.

This combined study highlighted the high petrophysical variability of carbonate-hosted fault zones, with structural units characterized by sharp contacts and different velocity scaling. In particular, the persistence of compliant high-strain damage zones at shallow depth might strongly affect near-surface deformation.

How to cite: Fondriest, M., Vassallo, M., Garambois, S., Mitchell, T. M., Giuseppe, D. G., Doan, M.-L., and Voisin, C.: The heterogeneous near-surface velocity structure of carbonate-hosted seismogenic fault zones investigated at different length scales: from ultrasonic measurements to subsurface seismic tomography, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3376, https://doi.org/10.5194/egusphere-egu23-3376, 2023.

EGU23-3630 | ECS | Posters on site | EMRP1.6

Fluid diffusion and pore-pressure distribution in microcracked rocks 

Gang Lin, Samuel Chapman, Jérôme Fortin, and Alexandre Schubnel

Pore pressure has a major influence on the effective stress and thus on the mechanical behaviour and the physical (elastic and transport) properties of microcracked rocks. In the field, in-situ measurements of pore pressure are difficult beyond local measurements around boreholes. Yet, fluid migration is observed ubiquitously in the continental crust, whether in fault zones or in volcanic geothermal areas. In particular, pore pressure perturbations change the effective stress, which may lead to microseismic activity. This may also occur in conventional reservoirs, during CO2 storage, or during deep geothermal energy extraction.

In this study, we focus, in the laboratory, on the hydro-mechanical behavior of thermally treated Westerly granite and naturally microcracked Etna basalt samples (40 mm in diameter and 80 mm in length). The goal is to determine the pore pressure distribution and diffusion laws under different pore pressure gradients. First, classical (constant-flow method) permeability measurements under a small pore pressure gradient (1 MPa over the length of the sample) were carried out as a function of increasing confining pressure Pc (up to 70 MPa). The results show that the permeability of the samples varies exponentially with effective pressure, as expected for cracked porous rocks. The pressure sensitivity factor for permeability was deduced to be of the order of 0.011–0.057 MPa⁻¹.

In a second step, permeability was measured at high (70 MPa) confining pressure, under large pore-pressure gradients (up to 60 MPa). During this part of the experiments, pore pressure was measured along the sample using newly developed fluid pressure sensors (with an absolute accuracy of +/-1MPa). Under small pore pressure gradient (2.5 MPa), our results show that the pore pressure varies linearly over the length of the sample, as expected from Darcy’s law and a constant permeability. However, with increasing pore pressure gradient (up to 60 MPa), the linearity is lost, as the permeability can no longer be assumed constant along the sample.

To interpret our results, we solved the diffusion equation, assuming that permeability varies exponentially with effective pressure. For steady-state flow conditions, our observations of the pore pressure distribution in the samples are consistent with the theoretical predictions. In particular, we show that the shape of the pore-pressure distribution at steady state does not depend on the permeability itself, but rather on the permeability pressure sensitivity factor: the larger the latter, the more non-linear the pore pressure profile in the samples.
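
A minimal sketch of this prediction, assuming k = k0·exp(-γ(Pc - P)) so that exp(γP) becomes linear in position at steady state (illustrative values chosen within the ranges reported above, not the measured profiles):

```python
# Sketch of the steady-state pore-pressure profile when permeability varies
# exponentially with effective pressure, k = k0 * exp(-gamma * (Pc - P)).
# For steady 1D Darcy flow, exp(gamma * P) is then linear in x, so
# P(x) = (1/gamma) * ln[exp(gamma*P0) + (exp(gamma*PL) - exp(gamma*P0)) * x/L].
# Values below are illustrative, within the ranges reported in the abstract.
import numpy as np

gamma = 0.05         # pressure sensitivity factor (1/MPa), within 0.011-0.057
P0, PL = 5.0, 65.0   # pore pressures at the two sample ends (MPa), for Pc = 70 MPa
x = np.linspace(0.0, 1.0, 11)    # normalized position along the sample

P = np.log(np.exp(gamma * P0) + (np.exp(gamma * PL) - np.exp(gamma * P0)) * x) / gamma
linear = P0 + (PL - P0) * x      # profile expected for a constant permeability
print("max deviation from the linear profile: %.1f MPa" % np.max(np.abs(P - linear)))
```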

How to cite: Lin, G., Chapman, S., Fortin, J., and Schubnel, A.: Fluid diffusion and pore-pressure distribution in microcracked rocks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3630, https://doi.org/10.5194/egusphere-egu23-3630, 2023.

EGU23-4452 | Posters on site | EMRP1.6

IP signature of metallic particles: lessons learnt from field and laboratory experiments 

Pauline Kessouri, Clémence Ryckebusch, Alejandro Fernandez-Visentini, and Lee D. Slater

Past metallurgical sites and deposits account for a significant proportion of potentially contaminated sites in the European Union (EU): about 100,000 have been identified in the north-west regions of the EU alone. While recent wastes from sites still in operation are commonly recovered, this is not the case for old aggregated materials with a high content of ferrous (and other) metals, white and black slag, etc., which are considered sources of pollution and are costly to manage or dispose of. These sites could be considered opportunities to recover large volumes of resources (metals, materials and land) using urban mining techniques if they were better characterized.

The induced polarization (IP) method is a geophysical method known to be sensitive to the presence of various metallic particles disseminated in soil layers. While qualitative interpretation of the IP parameters measured in the field (i.e., resistivity and chargeability) is widespread, quantitative interpretation in terms of concentrations of different metallic particles is yet to be developed.

The example of the Pompey field site (FR), investigated as part of the NWE-REGENERATIS project (https://www.nweurope.eu/projects/project-search/nwe-regeneratis-regeneration-of-past-metallurgical-sites-and-deposits-through-innovative-circularity-for-raw-materials/), is used in this study to present the interest in using time domain IP (TDIP) field measurements to characterize metallurgical past deposits. Several paths are explored to convert resistivity and chargeability TDIP tomographies into quantitative interpretation of metallic element concentrations: (1) extraction of frequency data from TDIP field measurements; and (2) upscaling of lab results through numerical simulations.

Regarding (1), TDIP measurements were made with different time windows (i.e., different frequencies), giving access to spectral IP (SIP) processing and interpretation at five frequencies. These frequency-domain interpretations of the TDIP data can be compared to laboratory measurements and facilitate the upscaling of the identified petrophysical relationships.

Regarding (2), in order to interpret the TDIP results in terms of concentration of metallic particles, known petrophysical relationships and geochemical measurements obtained at the lab scale need to be interpreted at the field scale. We propose to use a Bayesian framework for inferring field-scale metallic particles concentrations, taking into account heterogeneity and anisotropy within the inversion schemes. This work is ongoing.

For both (1) and (2), it is crucial to find the best petrophysical relationships linking the IP parameters to the concentration and size of metallic particles. Wong (1979) developed a physics-based electrochemical model that is still used today. We further investigate the Wong model to explore the role of the background porous medium itself in determining the IP signature of disseminated metallic particles, and we discuss the sensitivity of the model for estimating metallic grain concentrations.

All these different research paths lead to a better understanding of metallic particles IP signature at a small scale, as well as discussions on how to use these findings to better characterize and reevaluate past metallurgical sites and deposits.

This study was funded by the North West Europe (NWE) Interreg project called NWE-REGENERATIS that aims at the regeneration of past metallurgic sites and deposits through innovative circularity for raw materials, and by Schlumberger-Doll Research Center (USA, MA).

How to cite: Kessouri, P., Ryckebusch, C., Fernandez-Visentini, A., and Slater, L. D.: IP signature of metallic particles: lessons learnt from field and laboratory experiments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4452, https://doi.org/10.5194/egusphere-egu23-4452, 2023.

EGU23-5997 | ECS | Posters on site | EMRP1.6

Quantification of electrical properties of deep crustal rocks based on their mineral modal proportion, fabric, and pressure-temperature conditions 

Hadiseh Mansouri, Virginia Toy, Kevin Klimm, Nikolai Bagdassarov, Mattia Pistone, Andrew Greenwood, and György Hetényi

Electrical resistivity tomography and electromagnetic inverse modelling are particularly useful to explore orogenic systems because the most important conductive components of rock masses are economically significant minerals (semi-metals like graphite, and semi-conducting minerals like sulphides), as well as certain clays and permeating saline fluids. Despite the efficiency of electrical measurements, anisotropic properties of the crust, which affect almost all acquired data, may lead to serious misinterpretation of the subsurface geology if they are ignored during data analysis. Understanding the geological causes of electrical anisotropy and heterogeneity, and considering their influence on field-scale electrical measurements, can provide crucial information on the crustal architecture and pore fluid network, as well as revealing the internal structure of fault zones and increasing the accuracy of locating critical mineral deposits. To this end, we aim to quantify the electrical properties of mid- to lower-crustal metamorphic and magmatic lithologies based on their micro- to macrostructures, conductive components and fluid contents as measured by laboratory methods. Our research also contributes to, and advances, the likely outcomes of the ICDP-supported project DIVE (Drilling the Ivrea-Verbano ZonE). DIVE is currently exploring the hidden portions of the continental lower crust and the crust-to-mantle transition zone of the Ivrea-Verbano Zone (Western Alps, Italy) in two boreholes at the sites of Megolo (DT-1a) and Ornavasso (DT-1b), separated by a distance of 7 km in Val d’Ossola. The first DIVE borehole, DT-1b, was completed in December 2022, reaching a depth of 578.5 metres, and rock cores of metapelite, gneiss, amphibolite, migmatite, and pegmatite were recovered. Some drill cores contained a range of potentially conductive lithologies, including sulphide- and graphite-bearing metapelites. In this research, we are measuring the electrical conductivity of a representative benchmark suite of bedrock outcrop samples from the region around the DIVE boreholes at elevated pressure and temperature. We are currently characterising the microstructural arrangement and distribution of conductive phases within these samples by electron-beam methods. To properly understand the electrical property measurements of the natural samples, we determine the contributions of each key conductive phase (graphite and sulphides). The bulk resistivity of a mixture of quartz + 10% graphite, synthesized in a solid-medium piston-cylinder apparatus, was found to be 1 Ω·m at a temperature of 22.5 °C and a pressure of 0.5 GPa. No change in bulk resistivity was observed with increasing temperature up to 1000 °C. At this conference, we will present the results of additional tests undertaken between January and April 2023. Our data will be employed in the interpretation of wireline electrical logs and borehole-to-surface electrical surveys from DT-1a and DT-1b.

How to cite: Mansouri, H., Toy, V., Klimm, K., Bagdassarov, N., Pistone, M., Greenwood, A., and Hetényi, G.: Quantification of electrical properties of deep crustal rocks based on their mineral modal proportion, fabric, and pressure-temperature conditions, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5997, https://doi.org/10.5194/egusphere-egu23-5997, 2023.

EGU23-6640 | Orals | EMRP1.6

A petrophysical model for the spectral induced polarization of clays 

Philippe Leroy, Alexis Maineult, Aida Mendieta, and Damien Jougnot

Clays are sedimentary minerals that are ubiquitous in the Earth’s continental crust. They have remarkable adsorption, catalytic and containment properties due to their high surface charge and very large specific surface area. However, their microstructural and electrochemical properties are not completely understood. In this study, we have developed a new petrophysical model to interpret laboratory spectral induced polarization measurements on kaolinite, illite and montmorillonite muds when salinity increases (from around 0.01 mol L⁻¹ to 1 mol L⁻¹ NaCl initially). Our model considers electrical conduction in the bulk and diffuse layer waters as well as polarization of the Stern layers of illite aggregates and of the Stern layers and interlayer spaces of Na-montmorillonite aggregates with different shapes and sizes. Maxwell-Wagner polarization was considered as well. By fitting predicted to measured SIP spectra, we found that the basal surface of clays controls Stern layer polarization and that the interlayer space of Na-montmorillonite may polarize in the mHz to kHz frequency range. Our study is a step forward to better understanding the high surface conductivity response of clays inferred from resistivity and induced polarization measurements.

How to cite: Leroy, P., Maineult, A., Mendieta, A., and Jougnot, D.: A petrophysical model for the spectral induced polarization of clays, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6640, https://doi.org/10.5194/egusphere-egu23-6640, 2023.

EGU23-6657 | Orals | EMRP1.6

Predicting transport properties in porous and fractured media, how fractal-based models can help petrophysicists? 

Damien Jougnot, Luis Guarracino, Mariangeles Soldi, Flore Rembert, Haoliang Luo, Santiago Solazzi, and Luong Duy Thanh

Since the great paradigmatic revolution initiated by Mandelbrot, we know that fractals are ubiquitous in nature. From coastlines to plant growth, fractal mathematics helps us describe and quantify many of nature’s properties. In the same way, fractal theory can be applied to porous and fractured media. In recent decades, numerous research studies have shown that fractal theory provides a solid framework for describing the properties of geological media. Based on advanced physical knowledge at the microscale, it is possible to use fractal patterns to describe transport properties in porous and fractured media. Fractal laws can be applied to describe the size distribution of pores and fractures, fracture widths, and pore irregularities, but also to relate these pore sizes to pore tortuosities. In this contribution, we review the significant advances that have been made in the field of petrophysics by applying fractal mathematics to describe fundamental petrophysical properties such as porosity, permeability, electrical conductivity, thermal conductivity, and electrokinetic and electroosmotic coupling coefficients. These petrophysical models are based on upscaling procedures applied to different fractal objects such as the Sierpinski carpet, Koch curves, pigeon holes, and the Menger sponge, among others. Among the interesting results obtained by means of fractal-based petrophysics, one can derive the transport properties of saturated or partially saturated media, above and below the freezing temperature, and considering hysteretic behavior and reactive media dissolution/precipitation processes. Integrating these fractal-based petrophysical relationships into laboratory- or field-scale numerical simulations is now opening a wide range of potential avenues for progress in near-surface and reservoir geophysics.
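
As a concrete example of the scaling laws such models build on, a standard fractal capillary-bundle relation from the literature reads as follows (the notation is assumed here for illustration and is not quoted from the abstract):

```latex
% Cumulative number of pores/capillaries with radius larger than r in a
% fractal porous medium with fractal dimension D_f:
N(\geq r) = \left(\frac{r_{\max}}{r}\right)^{D_f},
\qquad
-\,\mathrm{d}N = D_f\, r_{\max}^{D_f}\, r^{-(D_f+1)}\,\mathrm{d}r
```

Upscaling then typically proceeds by integrating a single-pore transport law (e.g., Poiseuille flow for permeability) over this pore-size distribution, which is how fractal-based closed-form expressions for the properties listed above are usually derived.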

How to cite: Jougnot, D., Guarracino, L., Soldi, M., Rembert, F., Luo, H., Solazzi, S., and Thanh, L. D.: Predicting transport properties in porous and fractured media, how fractal-based models can help petrophysicists?, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6657, https://doi.org/10.5194/egusphere-egu23-6657, 2023.

EGU23-6941 | ECS | Posters on site | EMRP1.6

Reviewing numerical simulation methods of nuclear magnetic resonance signals in porous media. 

Francisca Soto Bravo, Chi Zhang, and Lin Jia

Low-field nuclear magnetic resonance (NMR) is a minimally invasive geophysical method often used to characterize pore spaces, water content, and fluid transport and distribution in geologic materials. NMR measurements are based on the magnetization and relaxation behavior of the spin magnetic moment of hydrogen atoms in external magnetic fields. These measurements can be taken in the field, such as from a borehole or the surface of the Earth, or in the laboratory using a bench-top apparatus. Numerical simulations of NMR signals are valuable tools to better understand the relaxation behavior of pore water under different scenarios, explore the effect of changes in the composition or geochemical characteristics of the geologic material, verify experimental findings, and improve the interpretation of field measurements. They can also be used to examine situations where the traditional interpretation of NMR signals fails, such as in complex, heterogeneous geometries with pore coupling effects. In a pore-coupled system, significant magnetization exchange between pores of different sizes occurs during the measurement time, which makes it difficult to independently characterize the pore environments. Using numerical simulations, the factors that control pore coupling, such as surface relaxivity, pore-network connectivity and other pore-network characteristics, can be explored independently in a controlled setting. In this work, we introduce common numerical modelling approaches used for simulating NMR responses in geologic materials, along with their limitations and traditional workflows. We present two specific examples: a Random Walk (RW) simulation to test the effect of different pore-network connectivity features on pore coupling in a simplified pore geometry, and a Finite Element Method (FEM) simulation approach to visualize the distribution of magnetization density within a single pore. NMR is a promising hydrogeophysical tool gaining popularity and finding new applications in near-surface exploration. A better understanding of NMR signals in diverse and complex scenarios is essential for the adequate design of experiments and field campaigns and for the correct interpretation of NMR measurements at different scales. The use of numerical modelling strategies can help improve this understanding, leading to more accurate and reliable measurements and interpretations.
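
To make the random-walk approach concrete, the sketch below simulates surface relaxation in a single spherical pore using the standard "killing probability" scheme at the pore wall. It is a minimal illustration with assumed parameter values, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed, illustrative parameters
a = 5e-6        # pore radius (m)
D = 2.3e-9      # water self-diffusion coefficient (m^2/s)
rho = 1e-5      # surface relaxivity (m/s)
dt = 1e-6       # time step (s)
eps = np.sqrt(6 * D * dt)          # fixed random-walk step length
kill_p = 2 * rho * eps / (3 * D)   # common first-order wall-absorption probability

n_walkers, n_steps = 20000, 5000
pos = np.zeros((n_walkers, 3))     # all walkers start at the pore centre
alive = np.ones(n_walkers, dtype=bool)
m = np.empty(n_steps)              # normalized magnetization decay

for it in range(n_steps):
    step = rng.normal(size=(n_walkers, 3))
    step *= eps / np.linalg.norm(step, axis=1, keepdims=True)
    trial = pos + step
    hit = alive & (np.linalg.norm(trial, axis=1) >= a)
    killed = hit & (rng.random(n_walkers) < kill_p)     # walker relaxes at the wall
    alive &= ~killed
    pos = np.where((hit & alive)[:, None], pos, trial)  # surviving wall hits are reflected
    m[it] = alive.mean()

t = np.arange(1, n_steps + 1) * dt
# Fast-diffusion regime check: 1/T2 ~ rho*(S/V) = 3*rho/a for a sphere
print("simulated T2 ~", -t[-1] / np.log(m[-1]), "s (statistical);",
      "fast-diffusion estimate:", a / (3 * rho), "s")
```

Replacing the single sphere with coupled pores of different sizes (and letting walkers cross the connecting throats) is the natural extension used to study the pore-coupling effects described above.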

How to cite: Soto Bravo, F., Zhang, C., and Jia, L.: Reviewing numerical simulation methods of nuclear magnetic resonance signals in porous media., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6941, https://doi.org/10.5194/egusphere-egu23-6941, 2023.

EGU23-7902 | EMRP1.6

Impact of chemical subprocesses during calcite precipitation in sandstones on the measured SIP response and their identification

A. M. Mansfeld and A. Kemna

Interactions between mineral phases and fluids in the subsurface inevitably lead to mineral precipitation and dissolution reactions. While these processes are the major drivers behind many geochemical changes in aquifer systems, their detection, monitoring and characterization are difficult. Geoelectrical methods provide the potential to investigate precipitation and dissolution reactions in rocks non-invasively. However, with the measurement of the DC electrical conductivity alone, changes in pore water salinity, mineralogy, or pore space characteristics can hardly be differentiated. The ambiguity in the identification of these processes can be reduced by also measuring the spectral induced polarization (SIP) response, i.e., the frequency-dependent complex electrical conductivity, given the sensitivity of especially the imaginary component to textural and chemical characteristics. In order to assess the capability of this approach, we conducted multiple laboratory experiments on quartz-rich sandstone samples in which different precipitation scenarios were provoked under controlled conditions while being monitored with SIP. The experimental setup consists of two reactant solutions in contact with the two sides of the sample, leading to a reaction within the sample as diffusion from each side into the rock proceeds. We used reactant solutions of NaHCO3 and CaCl2 in varying molality, the mixing of which in the sample’s pore space results in CaCO3 formation. By varying samples and solutions, three different components contributing to the complex conductivity response during the ongoing precipitation could be identified. The onset of the chemical reaction is clearly visible in the temporal evolution of the imaginary conductivity at relatively low frequencies. The observed temporal peak can be associated with changes in the pH value due to the infiltration of the reactant at earlier times and the reduction in pH with calcite precipitation. This explanation is supported by additional experiments performed on a similar sample, where the pH was altered by infiltration of NaHCO3 only. A second spectral high-frequency peak shows up at later stages of the experiments, suggesting that here the main changes of the pore surfaces in response to the precipitation are occurring. This phenomenon could not be recreated by the infiltration of a pure electrolyte solution or the infiltration of NaHCO3. The last component in the complex conductivity response is the continuous increase of the real component due to the increasing salinity of the pore water, which could also be reproduced in comparative measurements. Our results show the potential of complex conductivity measurements for precipitation monitoring in rocks, including improved textural and chemical characterization. Given the applicability of complex conductivity imaging at the field scale, the method thus holds promise for monitoring tasks in the context of, for example, carbon capture and storage, enhanced geothermal energy, soil stabilization, and the capture of dissolved contaminants, which are of increasing societal relevance.
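
For reference, the monitored quantity can be written in the standard SIP form below (generic notation, added for illustration):

```latex
% Frequency-dependent complex electrical conductivity:
\sigma^{*}(\omega) = \sigma'(\omega) + i\,\sigma''(\omega)
= |\sigma(\omega)|\, e^{\,i\varphi(\omega)},
\qquad
\varphi = \arctan\!\frac{\sigma''}{\sigma'}
```

The real part σ′ carries the ohmic contribution that grows with pore-water salinity, while the imaginary part σ″ carries the polarization response used here to track the pH- and surface-related changes.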

How to cite: Mansfeld, A. M. and Kemna, A.: Impact of chemical subprocesses during calcite precipitation in sandstones on the measured SIP response and their identification, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7902, https://doi.org/10.5194/egusphere-egu23-7902, 2023.

EGU23-8629 | Posters virtual | EMRP1.6

Oil saturation quantitative evaluation in lacustrine shale: A novel insight from NMR T1-T2 experiments 

Shaolong Zhang, Jingong Cai, Jianping Yan, Xiaojun Zhu, and Min Wang

Oil saturation is important in shale reservoirs for identifying favorable sections and mapping geological sweet spots. Current oil saturation evaluation methods, including experiments and empirical formulas, are not suitable for shale reservoirs because of their complex mineralogy, fluid components, and pore structure. To establish a shale oil saturation calculation model, X-ray diffraction, one-dimensional and two-dimensional nuclear magnetic resonance (NMR), and oil-water two-phase displacement experiments were performed on shale samples collected from the upper sub-member of the fourth member of the Eocene Shahejie Formation in the Dongying sag, Jiyang Depression, Bohai Bay Basin. After data analysis, the reasons why oil is or is not produced in the displacement experiments were explained, the distribution characteristics of different shale components in the NMR T1-T2 map were analyzed, and a new shale oil saturation calculation method was proposed using NMR T2-sensitive parameters that reflect the changes in NMR T2 spectrum morphology with oil saturation, calibrated against the NMR T1-T2 map at different displacement stages. The results indicate that the pore structure of the shale samples is complex and shows strong heterogeneity according to the NMR T2 spectrum, and that the shale pore-size distribution is the main factor determining whether oil reaches the volumetric cylinder in the displacement experiment, given only slight differences in wettability. The NMR T1-T2 map is an effective way to identify the different components (kerogen and solid bitumen, adsorbed oil, free oil, structural and adsorbed water, free water) of shale samples; typically, kerogen and solid bitumen plot in the top left of the T1-T2 map with T1 > 10 ms and T2 < 0.1 ms. On this basis, the T2 thresholds for free oil and adsorbed oil are 2 and 0.2 ms, and the corresponding pore-radius thresholds are 40 and 4 nm according to NMR theory. As NMR T2 spectrum sensitive parameters, the geometric mean and the interval porosity corresponding to the first peak are positively and negatively correlated with oil saturation, respectively. With this understanding, an oil saturation calculation method was established using the above two parameters; the Root Mean Square Error (RMSE) between the measured oil saturation and the calculated results is 5.78%, reflecting the accuracy and validity of the method. In general, this method allows shale oil saturation to be calculated accurately and provides a parameter basis for determining favorable sections and evaluating the resources of shale oil reservoirs. Moreover, it also offers a new approach to oil saturation prediction by NMR logging.
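
As a minimal numerical illustration of the two T2-sensitive parameters used in the proposed model (computed on a synthetic spectrum with assumed cutoffs, not the authors' data):

```python
import numpy as np

t2 = np.logspace(-2, 3, 200)   # T2 bins (ms)
amp = np.exp(-0.5 * ((np.log10(t2) - 0.3) / 0.5) ** 2)  # synthetic T2 spectrum
phi = amp / amp.sum()          # normalized incremental porosity per bin

t2_gm = np.exp(np.sum(phi * np.log(t2)))  # geometric mean of the T2 spectrum
# Approximation of the first-peak interval porosity, here taken below the
# 2 ms free-oil cutoff quoted in the abstract:
peak1_porosity = phi[t2 < 2.0].sum()

print(f"T2 geometric mean = {t2_gm:.2f} ms, first-peak interval porosity = {peak1_porosity:.2f}")
```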

How to cite: Zhang, S., Cai, J., Yan, J., Zhu, X., and Wang, M.: Oil saturation quantitative evaluation in lacustrine shale: A novel insight from NMR T1-T2 experiments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8629, https://doi.org/10.5194/egusphere-egu23-8629, 2023.

EGU23-8860 | ECS | Orals | EMRP1.6

Effective seismic properties of fractured rocks: the role played by fracture scaling characteristics 

Gabriel Quiroga, Santiago Solazzi, Nicolás Barbosa, J. Germán Rubino, Marco Favino, and Klaus Holliger

The seismic characterization of fractured geological formations is of importance for a wide range of applications throughout the Earth, environmental and engineering sciences, such as, for example, hydrocarbon exploration and production, CO2 sequestration, monitoring of enhanced geothermal reservoirs, nuclear waste storage, and tunneling operations. Seismic methods are indirect in nature and, hence, comprehensive modelling techniques are required to translate corresponding observations into rock physical properties. In this regard, numerous works have employed the theoretical framework of poroelasticity in order to explore the seismic response of particularly complex and elusive parameters of fluid-saturated fracture networks, such as their fracture density and interconnectivity. This is motivated by the fact that poroelasticity makes it possible to account for fluid pressure diffusion effects between connected fractures as well as between fractures and their embedding background. Fluid pressure diffusion prevails when zones of contrasting compliance are traversed by a seismic wave, as this results in pressure gradients, which induce oscillatory fluid flow and, consequently, energy dissipation. This form of energy dissipation has a significant impact on seismic velocity dispersion, attenuation, and anisotropic characteristics, which are key seismic observables. While a wide range of approximations is employed to represent fracture properties when computing the seismic response of formations, they tend to inherently ignore the complex interrelationships between the lengths, compliances, apertures, and permeabilities of fractures, which thus remain, as of yet, unaccounted for. In this work, we seek to alleviate this shortcoming by using a poroelastic modelling approach to explore how length-dependent fracture scaling characteristics affect the effective seismic properties of fractured rocks. We start by revisiting canonical models with two orthogonally intersecting fractures of different lengths to analyze the interactions occurring when the fractures are affected by a seismic wavefield. We then proceed to explore how scaling relations affect these results. Finally, we consider fracture networks with realistic stochastic length distributions, for which we compare the effective seismic response with and without the proposed length-dependent scaling of the fracture characteristics. Our results demonstrate that the scaling of fracture properties does indeed have a significant effect on the seismic response, as it dramatically reduces the contribution of smaller fractures to fluid pressure diffusion between connected fractures, which, in turn, affects the overall seismic characteristics of the formation.

How to cite: Quiroga, G., Solazzi, S., Barbosa, N., Rubino, J. G., Favino, M., and Holliger, K.: Effective seismic properties of fractured rocks: the role played by fracture scaling characteristics, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8860, https://doi.org/10.5194/egusphere-egu23-8860, 2023.

EGU23-11037 | Orals | EMRP1.6

Mechanical Earth Modelling for Petroleum Reservoir in Western Offshore India: Tensile Failure Study 

Sarada Prasad Pradhan and Krishna Chandra Sundli

Quantifying in-situ stress is crucial for predicting drilling-induced tensile fractures, wellbore failures, proper well placement, hydro-fracture treatment optimization and sand production. A comprehensive mechanical earth model incorporating pore pressure, stress state, and rock mechanical properties enables us to study the cause of the failures observed in a well. The study is focused on a petroleum reservoir in Western Offshore India, and an attempt is made to estimate the in-situ stresses present in the field. Well-log data calibrated with the available direct pressure measurements, viz. Modular Dynamic Test (MDT) and Leak-Off Test (LOT) data, are used to predict the pore pressure and the minimum horizontal stress. Vertical stress is estimated by extrapolating the density log; for the minimum and maximum horizontal stresses, the poroelastic approach is adopted. Key rock strength parameters were estimated using standard correlations and regional studies. A wellbore stability analysis was carried out, and the results were calibrated against the actual mud weight used. Natural fractures present in the reservoir are sensitive to the stress distribution, which in turn is sensitive to changes in the pore pressure distribution. Many exploratory and development wells have been drilled in the area, but very few have recorded DSI (Dipole Shear Sonic Imager) and FMI (Formation MicroImager) logs. With the available log data, the study quantifies the rock mechanical parameters and the stress magnitudes of the field. The study aims to model the geomechanical behaviour of the study area for better prediction of drilling-induced challenges, thereby reducing non-productive time (NPT) and optimizing drainage.

How to cite: Pradhan, S. P. and Sundli, K. C.: Mechanical Earth Modelling for Petroleum Reservoir in Western Offshore India: Tensile Failure Study, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11037, https://doi.org/10.5194/egusphere-egu23-11037, 2023.

EGU23-16816 | ECS | Posters on site | EMRP1.6

Correlating Seismic Wave Velocities with the Physicomechanical Properties of Carbonate Rocks 

Anamika Sahu, Sandeep Singh, Narendra Kumar Samadhiya, and Anand Joshi

Measurements of seismic wave velocities (both P- and S-waves) have been carried out on carbonate rock samples collected from Lesser Himalayan deposits exposed along the Alaknanda valley between Rudraprayag and Helang village in Uttarakhand, India. This study evaluates the effect of the petrophysical and mechanical properties of rocks on seismic wave velocities. On the core samples, petrophysical and mechanical measurements were performed: porosity, density, water absorption, and seismic wave velocities were determined first, followed by the uniaxial compressive strength (UCS) and the Brazilian tensile strength (BTS). Thin sections were prepared to measure the petrographic parameters (textural properties and mineralogical composition); this study focuses mainly on grain size and mineral composition. Petrographic investigation and X-ray diffraction (XRD) analysis were carried out to identify the mineralogy, and both revealed that the main constituent mineral is dolomite, with minor amounts of calcite, quartz, and opaque minerals. Interrelationships between seismic wave velocities and porosity, density, mineral constituents, grain size, uniaxial compressive strength, and Brazilian tensile strength were obtained using regression analysis. Significant positive correlations were found between compressional wave velocity and uniaxial compressive strength (r² = 0.82) and Brazilian tensile strength (r² = 0.67). Similarly, strong to moderate correlations were found between shear wave velocity and uniaxial compressive strength (r² = 0.73) and Brazilian tensile strength (r² = 0.68). Weak to moderate negative correlations were found between seismic wave velocities and porosity, and moderate positive correlations between seismic wave velocities and dry density. A moderate negative correlation was found between uniaxial compressive strength and grain size. Furthermore, it is concluded that the influence of grain size on rock strength is more important than that of mineral content.
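
For readers unfamiliar with the quoted r² values: they are coefficients of determination from simple linear regression between each velocity and each strength parameter. A minimal sketch (placeholder numbers, not the measured data) is:

```python
import numpy as np
from scipy.stats import linregress

vp = np.array([4500.0, 5200.0, 4800.0, 5600.0, 5100.0])  # P-wave velocity (m/s), placeholder
ucs = np.array([60.0, 95.0, 75.0, 120.0, 90.0])          # UCS (MPa), placeholder
res = linregress(vp, ucs)
print("r^2 =", res.rvalue ** 2)  # coefficient of determination
```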

How to cite: Sahu, A., Singh, S., Kumar Samadhiya, N., and Joshi, A.: Correlating Seismic Wave Velocities with the Physicomechanical Properties of Carbonate Rocks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16816, https://doi.org/10.5194/egusphere-egu23-16816, 2023.

EGU23-1562 | ECS | Posters virtual | GI2.1

A new finite-difference stress modeling method governed by elastic wave equations 

Zhuo Fan, Fei Cheng, and Jiangping Liu

Numerical stress or strain modeling has been a subject of focus in many fields, especially in assessing the stability of key engineering structures and better understanding local and tectonic stress patterns and seismicity. Here we propose a new stress modeling method governed by elastic wave equations using a finite-difference scheme. Based on the modeling scheme of wave propagation, the proposed method is able to solve both the dynamic stress evolution and the static stress state of equilibrium by introducing an artificial damping factor to the particle velocity. We validate the proposed method on three geophysical benchmarks: (a) a layered earth model under gravitational load, (b) a rock mass model under nonuniform loads on its exterior boundaries, and (c) a fault zone with strain localization driven by regional tectonic loading as measured by a GPS velocity field. Because the governing equations of the proposed method are wave equations instead of equilibrium equations, we are able to use the perfectly matched layer as the artificial boundary condition for models in unbounded domains, which substantially improves their accuracy. Also, the proposed scheme maps the physical model on simple computational grids and is therefore more memory efficient, since grid-point positions need not be stored. Besides, the efficient parallel computing of the finite-difference method guarantees the proposed method’s advantage in computational speed. As a minor modification to a wave modeling scheme, the proposed stress modeling method is not only accurate for geological models across different scales, but also physically reasonable and easy to implement for geophysicists.
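
The core idea, time-stepping the elastic wave equation with an artificial damping of the particle velocity until the solution relaxes to static equilibrium, can be sketched in one dimension as follows (a minimal illustration with assumed parameters, not the authors' code; gravitational loading of a column with a free surface on top and a rigid base):

```python
import numpy as np

# Staggered 1-D velocity-stress grid: v at nodes, s (vertical stress) at cell centres
nz, dz = 200, 5.0                 # cells, spacing (m)
rho, vp, g = 2500.0, 3000.0, 9.81
M = rho * vp ** 2                 # P-wave modulus
dt = 0.5 * dz / vp                # CFL-stable time step
damp = 0.01                       # artificial damping factor (assumed value)

v = np.zeros(nz + 1)              # particle velocity (z positive downward)
s = np.zeros(nz)                  # stress, tension positive

for _ in range(100000):
    # ghost cells: mirrored stress gives a traction-free surface at z = 0
    s_pad = np.concatenate(([-s[0]], s, [s[-1]]))
    v += dt * (np.diff(s_pad) / (dz * rho) + g)  # momentum eq. with gravity body force
    v *= 1.0 - damp               # damping drives the run toward the static state
    v[-1] = 0.0                   # rigid bottom boundary
    s += dt * M * np.diff(v) / dz # constitutive (Hooke's law) update

# At convergence, s ~ -rho*g*z (lithostatic stress; compression is negative here)
z = (np.arange(nz) + 0.5) * dz
print("relative misfit to lithostatic stress:",
      np.abs(s + rho * g * z).max() / (rho * g * z[-1]))
```

Removing the damping line recovers an ordinary wave simulation, which is exactly the dual use (dynamic evolution versus static equilibrium) described above.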

How to cite: Fan, Z., Cheng, F., and Liu, J.: A new finite-difference stress modeling method governed by elastic wave equations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1562, https://doi.org/10.5194/egusphere-egu23-1562, 2023.

EGU23-2228 | ECS | Posters on site | GI2.1

Non-destructive geophysical damage analysis of medieval plaster in the cloister of the St. Petri Cathedral Schleswig (Germany) 

Yunus Esel, Ercan Erkul, Detlef Schulte-Kortnack, Christian Leonhardt, Julika Heller, and Thomas Meier

Buildings that have existed for centuries undergo structural changes over time due to variations in use. In addition, many structures are severely damaged, for example by moisture intrusion. To determine the distribution of moisture in a structure, it is often examined pointwise by core sampling. In addition to such invasive methods, non-destructive methods may be applied to obtain three-dimensional hints on the moisture distribution within structures of interest.
The purpose of this paper is to show that the non-destructive determination of moisture distribution is possible by using and combining geophysical measurement methods such as infrared thermography (IR), ultrasound (US) and ground penetrating radar (GPR). There are examples of the combination of these methods for non-destructive examination, but such a combination is not yet commonly applied in the field of restoration and conservation of historic buildings.
We present results of geophysical investigations of medieval wall paintings in the cloister of the cathedral in Schleswig (Federal State Schleswig-Holstein, Northern Germany) in the framework of a project funded by the German Federal Foundation for the Environment (Deutsche Bundesstiftung Umwelt - DBU). In the cloister, large-scale alterations of the medieval red-line paintings occurred due to gypsum deposits and a shellac coating. In order to quantify the material properties of a vault section (yoke) in the cloister during the restoration, ultrasound surface wave measurements, passive and active thermography, and ground penetrating radar measurements were carried out.
Repeated measurements at intervals of several months made it possible to evaluate the effectiveness of the test treatments with different solvents to remove the shellac as well as the gypsum deposits. In addition, our results from the passive thermography measurements show that in one section a defect in the horizontal barrier could be responsible for moisture ingress and the associated damage. The radargrams recorded in this area confirm that a significant change in reflection amplitudes is present in the areas of increased moisture.

How to cite: Esel, Y., Erkul, E., Schulte-Kortnack, D., Leonhardt, C., Heller, J., and Meier, T.: Non-destructive geophysical damage analysis of medieval plaster in the cloister of the St. Petri Cathedral Schleswig (Germany), EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2228, https://doi.org/10.5194/egusphere-egu23-2228, 2023.

EGU23-2347 | ECS | Posters on site | GI2.1

Non-destructive testing methods and numerical development for enhancing airfield pavement management 

Konstantinos Gkyrtis, Christina Plati, and Andreas Loizos

Pavements are an essential component of airport facilities. Airport infrastructures serve to safely transport people and goods on a day-to-day basis. They promote economic development, both regionally and internationally, by also boosting tourist flows. In times of crisis, they can be used for societal emergencies, such as managing migration flows. Therefore, airports need pavements in good physical condition to ensure uninterrupted operations. However, interventions on airfield pavements are costly and labor intensive. Aspects of pavement structural performance related to bearing capacity and damage potential remain of paramount importance as the service life of a pavement extends beyond its design life. Therefore, structural condition evaluation is required to ensure the long-term bearing capacity of the pavement. 

The design and evaluation of flexible airfield pavements are generally based on the Multi-Layered Elastic Theory (MLET) in accordance with Federal Aviation Administration (FAA) principles. The most informative tool for structural evaluation is the Falling Weight Deflectometer (FWD), which senses pavement surfaces using geophones that record load-induced deflections at various locations. Additional geophysical inspection data from Ground Penetrating Radar (GPR) are processed to estimate the stratigraphy of the pavement. The integration of the above data provides an estimate of the pavement's performance and damage potential. However, GPR is not always readily applicable.

In addition, the most important concern in pavement evaluation is the mechanical characterization of pavement materials. At the top of pavement structures, asphalt mixtures behave as a function of temperature and loading frequency. This viscoelastic behavior deviates from the MLET assumptions and needs further investigation. Therefore, this study integrates measured NDT data with sample data from cores taken in situ. The pavement under study is an existing asphalt pavement of a runway at a regional airport in Southern Europe. A comparative evaluation of the strain state within the pavement body is performed both at critical locations and at the pavement surface, taking into account elastic and viscoelastic behaviors. Strains are an important input to models of long-term pavement performance, which has a critical influence on aircraft maneuverability. In turn, the significant discrepancies found highlight the need for more mechanistic considerations in predicting the damage and stress potential of airfield pavements, so that maintenance and/or rehabilitation needs can be better managed and planned.
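
The temperature and loading-frequency dependence mentioned here is commonly captured by a sigmoidal dynamic-modulus master curve; the standard form from pavement engineering is shown below for illustration, with generic fitting parameters rather than values from this study:

```latex
% Sigmoidal master curve for the asphalt dynamic modulus |E^*| at the
% reduced loading frequency f_r (time-temperature superposition with
% shift factor a(T)):
\log |E^{*}| = \delta + \frac{\alpha}{1 + e^{\,\beta + \gamma \log f_r}},
\qquad
\log f_r = \log f + \log a(T)
```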

Overall, this study highlights the sensing capabilities of NDT data towards a structural health monitoring of airfield pavements. Ground-truth data from limited destructive testing enrich pavement evaluation processes and enhance conventional FAA evaluation procedures. The study proposes a numerical development for accurate field inspections and improved monitoring protocols for the benefit of airfield pavement management and rehabilitation planning. 

How to cite: Gkyrtis, K., Plati, C., and Loizos, A.: Non-destructive testing methods and numerical development for enhancing airfield pavement management, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2347, https://doi.org/10.5194/egusphere-egu23-2347, 2023.

EGU23-2869 | GI2.1

Pleistocene/Holocene (P/H) boundary oceanic Koefels-comet Impact Series Scenario (KISS) of 12.850 yr BP Global-warming Threshold Triad (GTT)-Part III

M. Bujatti-Narbeshuber

The Laacher See Event- (LSE-) volcanism isochrone of 12.850 yrs BP (Bujatti-Narbeshuber, 1997), proxy for the P/H boundary KISS (Bujatti-Narbeshuber, 1996), was improved from Gerzensee varves to 13.034 cal yrs BP (Van Raden, 2019).

    This LSE date now separates the end-Pleistocene, first, mainly oceanic-water KISS from the second, Holocene Younger Dryas Onset (YDO) continental-ice impact, as predicted by the KISS hypothesis, separating: “a continental Koefels-comet ice-impact, from the mainly oceanic KISS, at the Pleistocene/Holocene boundary, associated with global warming, dendro C14 spikes, faunal mass extinction...” (Bujatti-Narbeshuber, 1996; Max, 2022).

    Oceanic-water LSE-KISS (13.034 cal yrs BP, varves), at the end-Alleroed temperature maximum, is separated by 157 yrs from continental-ice YDO-KISS (12.877 cal yrs BP, varve date). A larger gap of 184 yrs results when taking the C14-dated YD-KISS (12.850 cal yrs BP), approaching the 200 yrs of earlier varve studies (Bujatti-Narbeshuber, 1997).

    LSE-KISS varve-date differs by 47 yrs from geo-magnetic Gothenberg Excursion Onset- (GEO-) isochrone of 13.081 cal yrs BP (Chen, 2020), suggesting geo-magnetic reversal, True Polar Wander (TPW) GEO-TPW-KISS from 2 Koefels-comet (Taurid-) fragments. This considers end-paleolithic Magdalenian Impact Sequelae Symbolisations (MISS).

    Questioning P/H isostatic-unloading volcanism (Zielinsky, 1996), LSE-KISS volcanism is from Mid Atlantic Ridge & Mid Atlantic Plateau (MAR&MAP) impact (Bujatti-Narbeshuber, 1997, 2022), as further corroborated by Greenland (NGRIP) ice-core sulfate monitoring: from LSE-KISS-volcanism (12.978 cal yrs) to YDO (12.867 cal yr BP), within 110 yrs, an unprecedented, bipolar-volcanic-eruption-quadruplet resulted (Lin, 2022).

    The first Taurid LSE-KISS (Varves-date: 13.034 cal yrs BP, GEO-date: 13.084 cal yrs BP.) into oceanic-water is evident from two 700 km Mid Atlantic Ridge & Plateau Lowering Events (MARPLES) releasing two separate Tsunamis (Bujatti-Narbeshuber, 2022): Resulting in submarine explosive-magmatism-silicates, seafloor-carbonates, volcanic-ash and sea-water in huge strato-meso-spheric overheated steam-plume moving eastward by eolian transport, descending in drowning rain-flood, largely contributing to Eurasian loess sediment layer (Muck, 1976).

    This is stratigraphically verified in e.g. relative stratigraphic positions in Netherland, Geldrop-Aalsterhut, with Younger Coversand I, bleached (!) (AMS 13.080- 12.915 cal yrs BP) underlying intercalated (!), charcoal rich (AMS 12.785-12.650 cal yrs BP) Usselo Horizon (Andronikov, 2016). It corresponds to US, Black Mats stratigraphy from second Taurid, continental-ice, YD-KISS (12.850 cal yrs BP, C14) plus Carolina Bays (CB) with: 1. Soft, white, loess sediment from first oceanic LSE-KISS. 2. YD-KISS proxies-stratum. 3. e.g. Carolina-Florida-coast-sand-disturbances, within 1.500 km radius of continental-ice YD-KISS ice-ejecta impact-curtain of 500.000 CB (LIDAR) 4. Black Mats after YD-KISS.

    After visiting the Koefels crater, a “below continental-glacier-ice, circular geomagnetic-anomaly with paleoseismic Koefels-corridor of twelve Holocene rockfalls”, Eugene Shoemaker (Vienna, May 5th 1997), when asked about Carolina Bays causation, is quoted: “Eugene spoke of a late Pleistocene origin of the Bays and as glaciological features while I preferred the paleoseismic interpretation. I interpret them as paleoseismic impact-seismic liquefaction features. They … are the first evidence for a late Pleistocene impact event. Dated by me …12.850 BP (1950) in calendar years”. (Bujatti-Narbeshuber, NHM letter to John Grant III, Sept. 22nd 1997).

    Both P/H impacts break & make, Pleistocene criticality & Holocene damped flow, through 700 km geomorphological threshold (GLOVES) submersion & through (GTT) water and CO2 greenhouse-gas production, beyond the glaciation threshold for hot climate prediction.

How to cite: Bujatti-Narbeshuber, M.: Pleistocene/Holocene (P/H) boundary oceanic Koefels-comet Impact Series Scenario (KISS) of 12.850 yr BP Global-warming Threshold Triad (GTT)-Part III, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2869, https://doi.org/10.5194/egusphere-egu23-2869, 2023.

EGU23-2980 | GI2.1

Numerical modelling of seismic field record with elastic velocity construction for CO2 sequestration in offshore, South Korea

S. Cheong, M. Kang, and K. J. Kim

To evaluate the feasibility of CO2 sequestration offshore South Korea, we performed numerical modelling with an elastic velocity model. The CO2 storage candidate is a brine-saturated aquifer formation overlain by a basalt caprock on the Southern Continental Shelf of Korea. A basalt formation without joints and fractures can seal the storage volume, preventing leakage of the injected CO2. A preliminary two-dimensional seismic exploration estimated the storage potential at 42.07 to 143.79 Mt of CO2. The input model includes the P- and S-wave velocities and densities of the shallow sediment and basalt layers. To simulate CO2 injection, we assumed an area of CO2 plume in the interval beneath the basalt formation and artificially decreased the P-wave velocity, S-wave velocity, and density values. The synthesized seismic records are comparable with the survey gathers in terms of direct arrivals and primary reflections. The ongoing work can be extended to a quantitative verification covering several cases of varying velocities and densities.

How to cite: Cheong, S., Kang, M., and Kim, K. J.: Numerical modelling of seismic field record with elastic velocity construction for CO2 sequestration in offshore, South Korea, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2980, https://doi.org/10.5194/egusphere-egu23-2980, 2023.

EGU23-4861 | Orals | GI2.1

Decay diagnosis of tree trunks using 3D point cloud and reverse time migration of GPR data 

Zhijie Chen, Hai Liu, Meng Xu, Yunpeng Yue, and Bin Zhang

Health monitoring and disease mitigation of trees are essential to ensure the sustainability of the wood industry, the safety of ecosystems, and the maintenance of climatic conditions. Several non-destructive testing methods have been applied to monitor and detect decay inside trunks. Among them, ground penetrating radar (GPR) has gained recognition due to its high efficiency and good resolution. However, due to the wide beam width of the antenna pattern and the complicated scattering caused by the trunk structure, the recorded GPR profile is far from the actual geometry of the tree trunk. Moreover, the irregular contour of the tree trunk makes traditional data processing algorithms difficult to apply. Therefore, an efficient migration algorithm with high resolution, as well as a high-accuracy survey-line positioning method for the curved contour of the trunk, should be developed.

In this paper, a combined approach is proposed to image the inner structures of irregular-shaped trunks. In the first step, the 3D contour of the targeted tree trunk is built up by a 3D point cloud technique, photographing around the trunk at various angles. Subsequently, the 2D irregular contour of the cross-section of the trunk at the position of the GPR survey line is extracted by the Canny edge detection method to locate the accurate position of each GPR A-scan [1]. Thirdly, the raw GPR profile is pre-processed to suppress undesired noise and clutter. Then, a reverse time migration (RTM) algorithm based on the zero-time imaging condition is applied for image reconstruction using the extracted 2D contour [2]. Lastly, a denoising method based on total variation (TV) regularization is applied for artifact suppression in the reconstructed images [3].
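
Two of the steps above, contour extraction and TV-based artifact suppression, can be sketched with standard image-processing tools; the following is a minimal illustration on placeholder arrays using scikit-image, not the authors' implementation:

```python
import numpy as np
from skimage.feature import canny
from skimage.restoration import denoise_tv_chambolle

# Step 2: extract the 2D trunk contour from a cross-section image
cross_section = np.random.rand(256, 256)    # placeholder for the point-cloud-derived slice
contour = canny(cross_section, sigma=2.0)   # boolean edge map (Canny detector [1])

# Step 5: suppress artifacts in the migrated image with total-variation
# (ROF-type [3]) regularization
rtm_image = np.random.rand(256, 512)        # placeholder for the RTM output
rtm_clean = denoise_tv_chambolle(rtm_image, weight=0.1)
```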

Numerical, laboratory and field experiments are carried out to validate the applicability of the proposed approach. Both numerical and laboratory experimental results show that RTM can yield more accurate and higher-resolution images of the inner structures of the tree cross-section than the back-projection (BP) algorithm. The proposed approach is further applied to a diseased camphor tree, and an elliptical decay defect is found in the migrated GPR image. The results are validated by a visual inspection after the tree trunk was sawed down.

Fig. 1 Field experiment. (a) Geometric reconstruction result using point cloud data, (b) migrated result by the RTM algorithm and (c) bottom view of the tree trunk after sawing down. The red and yellow ellipses indicate the cavity and the decay region in the trunk, respectively.

References:

[1] Canny, "A Computational Approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Interllgent, vol. PAMI-8, no. 6, pp. 679-698, 1986, doi: 10.1109/TPAMI.1986.4767851.

[2] S. Chattopadhyay and G. A. McMechan, "Imaging conditions for prestack reverse-time migration," Geophysics, vol. 73, no. 3, pp. S81-S89, 2008, doi: 10.1190/1.2903822.

[3] L. I. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, pp. 259-268, 1992, doi: 10.1016/0167-2789(92)90242-F.

How to cite: Chen, Z., Liu, H., Xu, M., Yue, Y., and Zhang, B.: Decay diagnosis of tree trunks using 3D point cloud and reverse time migration of GPR data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4861, https://doi.org/10.5194/egusphere-egu23-4861, 2023.

EGU23-6795 | ECS | Orals | GI2.1

Relaxing requirements for spatio-temporal data fusion 

Harkaitz Goyena, Unai Pérez-Goya, Manuel Montesino-San Martín, Ana F. Militino, Peter M. Atkinson, and M. Dolores Ugarte

Satellite sensors need to make a trade-off between revisit frequency and spatial resolution. This work presents a spatio-temporal image fusion method called Unpaired Spatio-Temporal Fusion of Image Patches (USTFIP). The method combines data from different multispectral sensors and creates images combining the best of each satellite in terms of frequency and resolution. It generates synthetic images and selects optimal information from cloud-contaminated images, avoiding the need for cloud-free matching pairs of satellite images. The removal of this restriction makes it easier to run the fusion algorithm even in the presence of clouds, which are frequent in time series of satellite images. The increasing demand for larger datasets makes computationally optimized methods necessary; USTFIP is therefore programmed to run in parallel, reducing run-time relative to other methods. USTFIP is tested in an experimental scenario using procedures similar to those of Fit-FC, STARFM and FSDAF. Finally, USTFIP proves the most robust, since its prediction accuracy degrades at a much lower rate as classical requirements become progressively difficult to meet.

How to cite: Goyena, H., Pérez-Goya, U., Montesino-San Martín, M., F. Militino, A., Atkinson, P. M., and Ugarte, M. D.: Relaxing requirements for spatio-temporal data fusion, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6795, https://doi.org/10.5194/egusphere-egu23-6795, 2023.

Continual monitoring of tree roots, which is essential when considering tree health and safety, is possible using a digital model. Non-destructive techniques, for instance, laser scanning, acoustics, and Ground Penetrating Radar (GPR) have been used in the past to study both the external and internal physical dimensions of objects and structures [1], including trees [2,3]. Recent studies have shown that GPR is effective in mapping the root system's network in street trees [3]. Light Detection and Ranging (LiDAR) technology has also been employed in infrastructure management to generate 3D data and to detect surface displacements with millimeter accuracy [4]. However, scanning such structures using current state-of-the-art technologies can be expensive and time consuming. Further, continual monitoring of tree roots requires multiple visits to tree sites and, oftentimes, repeated excavations of soil.

This work proposes a Virtual Reality (VR) system using smartphone-based LiDAR and GPR data to capture ground surface and subsurface information to monitor the location of tree roots. Both datasets can be visualized in 3D in a VR environment for future assessment. LiDAR technology has recently become available in smartphones (for instance, the Apple iPhone 12+) and can scan a surface, e.g., the base of a tree, and export the data to a 3D modelling and visualization application. Using GPR data, we combined subsurface information on the location of tree roots with the LiDAR scan to provide a holistic digital model of the physical site. The system can provide a relatively low-cost environmental modelling and assessment solution, which will allow researchers and environmental professionals to a) create digital 3D snapshots of a physical site for later assessment, b) track positional data on existing tree roots, and c) inform the decision-making process regarding locations for potential future excavations.

Acknowledgments: Sincere thanks to the following for their support: Lord Faringdon Charitable Trust, The Schroder Foundation, Cazenove Charitable Trust, Ernest Cook Trust, Sir Henry Keswick, Ian Bond, P. F. Charitable Trust, Prospect Investment Management Limited, The Adrian Swire Charitable Trust, The John Swire 1989 Charitable Trust, The Sackler Trust, The Tanlaw Foundation, and The Wyfold Charitable Trust. The Authors would also like to thank Mr Dale Mortimer (representing the Ealing Council) and the Walpole Park for facilitating this research.

References

[1] Alani A. M. et al., Non-destructive assessment of a historic masonry arch bridge using ground penetrating radar and 3D laser scanner. IMEKO International Conference on Metrology for Archaeology and Cultural Heritage Lecce, Italy, October 23-25, 2017.

[2] Ježová, J., Mertens, L., Lambot, S., 2016. “Ground-penetrating radar for observing tree trunks and other cylindrical objects,” Construction and Building Materials (123), 214-225.

[3] Lantini, L., Alani, A. M., Giannakis, I., Benedetto, A. and Tosti, F., 2020. "Application of ground penetrating radar for mapping tree root system architecture and mass density of street trees," Advances in Transportation Studies (3), 51-62.

[4] Lee, J. et al., Long-term displacement measurement of bridges using a LiDAR system. Struct Control Health Monit. 2019; 26:e2428.

How to cite: Uzor, S., Lantini, L., and Tosti, F.: Low-cost assessment and visualization of tree roots using smartphone LiDAR, Ground-Penetrating Radar (GPR) data and virtual reality, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6908, https://doi.org/10.5194/egusphere-egu23-6908, 2023.

EGU23-8384 | ECS | Orals | GI2.1

A Study on the Effect of Target Orientation on the GPR Detection of Tree Roots Using a Deep Learning Approach 

Livia Lantini, Federica Massimi, Saeed Sotoudeh, Dale Mortimer, Francesco Benedetto, and Fabio Tosti

Monitoring and protection of natural resources have grown increasingly important in recent years, as the effects of emerging diseases have caused serious concern among environmentalists and communities. In this regard, tree roots are among the most crucial and fragile plant organs, as well as among the most difficult to assess [1].

Within this context, ground penetrating radar (GPR) applications have shown to be precise and effective for investigating and mapping tree roots [2]. Furthermore, in order to overcome limitations arising from natural soil heterogeneity, a recent study has proven the feasibility of deep learning image-based detection and classification methods applied to the GPR investigation of tree roots [3].

The present research proposes an analysis of the effect of root orientation on the GPR detection of tree root systems. To this end, a dedicated survey methodology was developed to compile a database of isolated roots. A set of GPR data was collected at different incidence angles with respect to each investigated root. The GPR signal is then processed in both the time and frequency domains to filter out noise-related information and obtain spectrograms (i.e., a visual representation of a signal's frequency spectrum over time). Subsequently, an image-based deep learning framework is implemented, and its performance in recognising outputs with different incidence angles is compared to that of traditional machine learning classifiers. The preliminary results of this research demonstrate the potential of the proposed approach and pave the way for the use of novel methods to enhance the interpretation of tree root systems.
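
As an illustration of the spectrogram step, a GPR A-scan can be converted into a time-frequency image with standard tools; the sketch below uses assumed parameters and a placeholder trace, not the authors' pipeline:

```python
import numpy as np
from scipy.signal import spectrogram

fs = 40e9                       # assumed equivalent sampling rate of the A-scan (Hz)
ascan = np.random.randn(2048)   # placeholder GPR trace
f, t, Sxx = spectrogram(ascan, fs=fs, nperseg=128, noverlap=96)
image = 10 * np.log10(Sxx + 1e-12)  # dB-scaled spectrogram image for the classifier
```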

 

Acknowledgements

The Authors would like to express their sincere thanks and gratitude to the following trusts, charities, organisations and individuals for their generosity in supporting this project: Lord Faringdon Charitable Trust, The Schroder Foundation, Cazenove Charitable Trust, Ernest Cook Trust, Sir Henry Keswick, Ian Bond, P. F. Charitable Trust, Prospect Investment Management Limited, The Adrian Swire Charitable Trust, The John Swire 1989 Charitable Trust, The Sackler Trust, The Tanlaw Foundation, and The Wyfold Charitable Trust. The Authors would also like to thank the Ealing Council and the Walpole Park for facilitating this research.

 

References

[1] Innes, J. L., 1993. Forest health: its assessment and status. CAB International.

[2] Lantini, L., Tosti, F., Giannakis, I., Zou, L., Benedetto, A. and Alani, A. M., 2020. "An Enhanced Data Processing Framework for Mapping Tree Root Systems Using Ground Penetrating Radar," Remote Sensing 12(20), 3417.

[3] Lantini, L., Massimi, F., Tosti, F., Alani, A. M. and Benedetto, F. "A Deep Learning Approach for Tree Root Detection using GPR Spectrogram Imagery," 2022 45th International Conference on Telecommunications and Signal Processing (TSP), 2022, pp. 391-394.

How to cite: Lantini, L., Massimi, F., Sotoudeh, S., Mortimer, D., Benedetto, F., and Tosti, F.: A Study on the Effect of Target Orientation on the GPR Detection of Tree Roots Using a Deep Learning Approach, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8384, https://doi.org/10.5194/egusphere-egu23-8384, 2023.

EGU23-8667 | ECS | Posters on site | GI2.1

An Investigation into the Acquisition Parameters for GB-SAR Assessment of Bridge Structural Components 

Saeed Sotoudeh, Livia Lantini, Kevin Jagadissen Munisami, Amir M. Alani, and Fabio Tosti

Structural health monitoring (SHM) is a necessary measure to keep bridge infrastructure safe. To this purpose, remote sensing has proven effective in acquiring data with high accuracy in a relatively short time. Amongst the available methods, the ground-based synthetic aperture radar (GB-SAR) can detect sub-millimetre deflections, as small as 0.01 mm, generated by moving vehicles or the environmental excitation of bridges [1]. Interferometric radars are also capable of data collection regardless of weather, day, and night conditions [2]. However, in the available literature there is a lack of studies and methods focusing on the actual capabilities of the GB-SAR to target specific structural elements and components of a bridge, which makes it difficult to associate the measured deflection with the actual bridge section. Depending on the antenna type, the footprint of the radar signal widens with distance and thus encompasses more elements, and the presence of multiple targets in the same resolution cell adds uncertainty to the acquired data [3]. To this end, the purpose of the present research is to introduce a methodology for pinpointing targets using GB-SAR and aiding data interpretation. An experimental procedure is devised to control acquisition parameters and targets, making it possible to analyse the returned outputs under more controlled conditions. The outcome of this research will add to the existing literature in terms of collecting data with enhanced precision and certainty.
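
The displacement sensitivity quoted above follows from the standard interferometric phase-to-displacement relation (generic notation, added for illustration; the sign depends on the adopted convention):

```latex
% Line-of-sight displacement recovered from the interferometric phase change:
d_{\mathrm{LOS}} = -\frac{\lambda}{4\pi}\,\Delta\varphi
```

Because the phase change Δφ can be resolved to a small fraction of a cycle, displacements of the order of 0.01 mm become detectable at typical GB-SAR (Ku-band) wavelengths of roughly 18 mm.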

 

Keywords

Structural Health Monitoring (SHM), GB-SAR, Remote Sensing, Interferometric Radar

 

Acknowledgements

This research was funded by the Vice-Chancellor’s PhD Scholarship at the University of West London.

 

References

[1] Benedettini, F., & Gentile, C. (2011). Operational modal testing and FE model tuning of a cable-stayed bridge. Engineering Structures, 33(6), 2063-2073.

[2] Alba, M., Bernardini, G., Giussani, A., Ricci, P. P., Roncoroni, F., Scaioni, M., Valgoi, P., & Zhang, K. (2008). Measurement of dam deformations by terrestrial interferometric techniques. Int.Arch.Photogramm.Remote Sens.Spat.Inf.Sci, 37(B1), 133-139.

[3] Michel, C., & Keller, S. (2021). Advancing ground-based radar processing for bridge infrastructure monitoring. Sensors, 21(6), 2172.

How to cite: Sotoudeh, S., Lantini, L., Munisami, K. J., Alani, A. M., and Tosti, F.: An Investigation into the Acquisition Parameters for GB-SAR Assessment of Bridge Structural Components, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8667, https://doi.org/10.5194/egusphere-egu23-8667, 2023.

EGU23-8762 | ECS | Orals | GI2.1

Joint Interpretation of Multi-Frequency Ground Penetrating Radar and Ultrasound Data for Mapping Cracks and Cavities in Tree Trunks 

Saeed Parnow, Livia Lantini, Stephen Uzor, Amir M. Alani, and Fabio Tosti

As the Earth's lungs, trees are a natural resource that provides, amongst other things, food, lumber, and oxygen. Therefore, monitoring these wooden structures with non-destructive testing (NDT) techniques such as ground penetrating radar (GPR) and ultrasound can provide valuable information about inner flaws and decay, which is an essential step in tree conservation.

In recent years, GPR and ultrasound have been used to delineate the interior architecture of tree trunks [1-3]. However, more research is required to improve the results and, consequently, achieve a more reliable interpretation. Due to limitations in penetration depth and signal-to-noise ratio [4], these approaches have a limited capacity for resolving features: gain functions applied to compensate for wave attenuation may exaggerate events, while the higher frequencies needed for resolution penetrate less deeply.

In this context, an integration of multi-frequency GPR and ultrasound data can be used to address this issue. Data were collected on a tree trunk log at the Faringdon Centre for Non-Destructive Testing and Remote Sensing using two high-frequency GPR systems (2 GHz and 4 GHz central frequencies) and ultrasound testing equipment (supporting a wide range of transducers from 24 kHz up to 500 kHz). Internal features of interest, including extended perimetric air gaps at the bark-wood interface, natural cracks and small artificial cavities, were investigated through electromagnetic and mechanical waves. After data compilation, a joint interpretation strategy for data analysis was developed. The processed data were mapped against the cut sections of the tree for validation.

Although the study of standing tree trunks would be more challenging, the findings of this research may be applied to wood timbers and pave the way for future research on living tree trunks.

 

Acknowledgements

This research was funded by the Vice-Chancellor’s PhD Scholarship at the University of West London.

 

References

[1] Arciniegas, A., et al., Literature review of acoustic and ultrasonic tomography in standing trees. Trees, 2014. 28(6): p. 1559-1567. 

[2] Giannakis, I., et al., Health monitoring of tree trunks using ground penetrating radar. IEEE Transactions on Geoscience and Remote Sensing, 2019. 57(10): p. 8317-8326.

[3] Espinosa, L., et al., Ultrasound computed tomography on standing trees: accounting for wood anisotropy permits a more accurate detection of defects. Annals of Forest Science, 2020. 77(3): p. 1-13.

[4] Tosti, F., et al., The use of GPR and microwave tomography for the assessment of the internal structure of hollow trees. IEEE Transactions on Geoscience and Remote Sensing, 2021. 60: p. 1-14.

 

How to cite: Parnow, S., Lantini, L., Uzor, S., Alani, A. M., and Tosti, F.: Joint Interpretation of Multi-Frequency Ground Penetrating Radar and Ultrasound Data for Mapping Cracks and Cavities in Tree Trunks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8762, https://doi.org/10.5194/egusphere-egu23-8762, 2023.

EGU23-10874 | ECS | Orals | GI2.1

Ground subsidence risk mapping and assessment along Shanghai metro lines by PS-InSAR and LightGBM 

Long Chai, Xiongyao Xie, Biao Zhou, and Li Zeng

Ground subsidence is a typical geological hazard in urban areas. It endangers the safety of infrastructure such as subways. In this study, the ground subsidence risk along Shanghai metro lines was mapped and assessed. Firstly, PS-InSAR was used for the ground subsidence survey, and subsidence intensity was divided into five classes according to subsidence velocity. Ten subsidence causal factors were collected, and the frequency ratio method was applied to analyze the correlation between subsidence and its causal factors. A LightGBM model was then used to generate a ground subsidence susceptibility map, and the receiver operating characteristic curve and the area under the curve (AUC) were adopted to assess the model; the AUC of 0.904 suggests excellent model performance. Finally, a risk matrix was introduced to combine the intensity and susceptibility of ground subsidence. The risk of ground subsidence was mapped and classified into five levels: R1 (very low), R2 (low), R3 (medium), R4 (high), and R5 (very high). The results showed that subway ground subsidence risk exhibited a region-dependent character: metro lines located in areas with higher regional ground subsidence risk also had higher line-level risk. Meanwhile, the statistics of subway ground subsidence risk levels showed that subway stations were safer than the sections between them.
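
A minimal sketch of the susceptibility-mapping step (placeholder data and a hypothetical feature matrix, not the authors' code):

```python
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X = np.random.rand(5000, 10)                   # 10 causal factors per grid cell (placeholder)
y = (np.random.rand(5000) > 0.7).astype(int)   # 1 = subsidence observed (placeholder labels)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
model = lgb.LGBMClassifier(n_estimators=400, learning_rate=0.05)
model.fit(Xtr, ytr)

susceptibility = model.predict_proba(Xte)[:, 1]    # per-cell susceptibility score in [0, 1]
print("AUC:", roc_auc_score(yte, susceptibility))  # ~0.5 on random data; 0.904 reported above
```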

How to cite: Chai, L., Xie, X., Zhou, B., and Zeng, L.: Ground subsidence risk mapping and assessment along Shanghai metro lines by PS-InSAR and LightGBM, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10874, https://doi.org/10.5194/egusphere-egu23-10874, 2023.

EGU23-12226 | ECS | Orals | GI2.1

Evaluation of Spectral Mixing Techniques for Geological Mixture in a Laboratory Setup: Insights on the nature of mixing 

Maitreya Mohan Sahoo, Kalimuthu Rajendran, Arun Pattathal Vijayakumar, Shibu K. Mathew, and Alok Porwal

Geological mixtures having endmembers mixed at a fine scale pose a challenge to estimating their fractional abundances. Light incident on these mixtures interacts both at multilayered and surface levels, resulting in volumetric and albedo scattering, respectively. Accounting for these effects necessitates a nonlinear spectral mixing model rather than conventional linear mixing. In this study, we evaluate the performances of linear and various nonlinear spectral mixing models for an intimately mixed geological mixture, i.e., a banded hematite quartzite (BHQ) sample. The BHQ sample, with distinct endmembers of hematite and quartzite, facilitated our study of the behavior of light on two-component nonlinear mixtures. In a laboratory-based experimental setup, we used a full-spectral-range spectroradiometer covering the visible and near-infrared regions (350 to 2500 nm) to acquire a hyperspectral image of the BHQ sample. This was followed by the identification of nonlinearly mixed regions and the inference of changes in their spectral features. The nonlinearity induced in these regions was attributed to two significant causes: (1) the fine scale of spectral mixing, and (2) the spectroradiometer sensor’s limited ability to spatially distinguish between focused and neighboring points, thereby producing a point spread effect. We observed the effects of nonlinear spectral mixing for our sample by changing the sensor’s height from 1 mm to 5 mm, simulating fine- and coarse-resolution images, respectively. The spectral mixing was modeled using the mapped ground-truth fractional abundances and library endmember spectra by linear mixing and by the established nonlinear techniques of the generalized bilinear model (GBM), the polynomial post-nonlinear model (PPNM), and kernel-based support vector machines (k-SVMs). The evaluated performance metric of reconstruction error revealed the nonlinearity effect in image pixels through statistical tests and the nonlinearity parameters used in these models. It was further observed that the associated nonlinearity increases from fine- to coarse-resolution images. The minimum image reconstruction error was observed for the polynomial post-nonlinear model, with a single nonlinearity parameter and an average reconstruction error (ARE) of 0.05. Our study provides insights into how the nature of nonlinear mixing varies with endmember composition and particle size.
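
For reference, the three parametric model families compared here take the following standard forms from the unmixing literature (notation assumed, not quoted from the abstract); the k-SVM approach is a nonparametric kernel method with no comparable closed form:

```latex
% Pixel y with R endmembers e_i, abundances a_i, noise n, \odot = Hadamard product:
\mathbf{y}_{\mathrm{LMM}} = \sum_{i=1}^{R} a_i \mathbf{e}_i + \mathbf{n}
\qquad
\mathbf{y}_{\mathrm{GBM}} = \sum_{i=1}^{R} a_i \mathbf{e}_i
  + \sum_{i=1}^{R-1}\sum_{j=i+1}^{R} \gamma_{ij}\, a_i a_j\,(\mathbf{e}_i \odot \mathbf{e}_j)
  + \mathbf{n}
\qquad
\mathbf{y}_{\mathrm{PPNM}} = \mathbf{x} + b\,(\mathbf{x} \odot \mathbf{x}) + \mathbf{n},
\quad \mathbf{x} = \sum_{i=1}^{R} a_i \mathbf{e}_i
```

The single scalar b in the PPNM is the "single nonlinearity parameter" referred to above, which is one reason this model is attractive for a two-endmember mixture such as BHQ.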

How to cite: Sahoo, M. M., Rajendran, K., Pattathal Vijayakumar, A., Mathew, S. K., and Porwal, A.: Evaluation of Spectral Mixing Techniques for Geological Mixture in a Laboratory Setup: Insights on the nature of mixing, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12226, https://doi.org/10.5194/egusphere-egu23-12226, 2023.

EGU23-13163 | ECS | Orals | GI2.1

High-resolution grain-size analysis and non-destructive hyperspectral imaging of sediments from the Gaoping canyon levee to establish past typhoon and monsoon activities affecting Taiwan during the late Holocene 

Joffrey Bertaz, Kévin Jacq, Christophe Colin, Zhifei Liu, Maxime debret, Hongchao Zhao, and Andrew Tien-Shun Lin

Non-destructive, high-resolution hyperspectral analyses are widely used in planetary and environmental sciences and in mining exploration. In recent years, the scanning method has been applied to lacustrine sediment cores as a complement to XRF core scanning; however, this approach has rarely been applied to marine sediments. The Gaoping canyon, located south of Taiwan island, is connected to the Gaoping River and is a very active canyon with a large sediment transfer capacity. In particular, about four typhoon-driven hyperpycnal flows per year have been recorded by mooring systems in recent years. Studying how their frequency and intensity responded to past climate and environmental changes is key to understanding future tropical storm frequency and related climate variability. Core MD18-3574 was collected on the western levee of the Gaoping canyon and displays numerous fine laminations (millimetric to centimetric) recording the deposition of the gravity flows occurring in the canyon and on the slope. In this study, we combined non-destructive analyses such as XRF core scanning and hyperspectral imaging with high-resolution grain-size and XRD bulk mineralogy analyses to understand the sedimentological and geochemical variations at the scale of the laminae. Core MD18-3574 sediments consist mainly of fine silt, presenting an alternation of fine-grained and coarse-grained laminations. The average mean grain size is 13.4 µm, ranging from 9 to 20.5 µm. Thick, coarser-grained laminations show the grain-size distributions and asymmetric sorting typical of turbidite sequences. Grain size and bulk mineralogy display strong visual and statistical correlation with XRF (Fe/Ca, Si/Al) and hyperspectral proxies (sediment darkness (Rmean), Clay_R2200). Principal component analysis (PCA) demonstrates that darker laminae are composed of coarser sediments with high Si/Al (quartz- and feldspar-rich) and Clay_R2200 values and low Fe/Ca (calcite-rich), resulting from gravity flows. Conversely, lighter laminae consist of finer sediments with low Si/Al (muscovite- and illite-rich) and Clay_R2200 and high Fe/Ca, resulting from hemipelagic deposition. This interpretation was then extended to the core scale to identify gravity-flow deposit layers. Moderate-intensity tropical storm frequency has decreased over the last 4 ka in response to the sea surface temperature (SST) decrease and the enhanced East Asian winter monsoon since the middle Holocene. Tropical storm intensity increased after 2 ka in La Niña-like periods, indicating that the surge of super-typhoons hitting Taiwan could be triggered by the El Niño Southern Oscillation (ENSO) state and variability. We can thus infer that tropical storm activity is controlled by SST, the monsoon system, and ENSO conditions. This study brings new insights into the prediction of the impacts of ongoing climate change on storm activity in the western Pacific Ocean.
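
A minimal sketch of the PCA step, using synthetic values for the proxies named above (mean grain size, Si/Al, Fe/Ca, Rmean, Clay_R2200) rather than the core data:

```python
# PCA on a proxy matrix to see whether a single component separates
# gravity-flow from hemipelagic laminae; all values below are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 300
turbidite = rng.random(n)                                   # latent flow intensity
data = np.c_[13 + 5 * turbidite + rng.normal(0, 1.0, n),    # mean grain size (um)
             1.0 + 0.8 * turbidite + rng.normal(0, 0.1, n),   # Si/Al
             1.5 - 0.9 * turbidite + rng.normal(0, 0.1, n),   # Fe/Ca
             0.4 - 0.2 * turbidite + rng.normal(0, 0.05, n),  # Rmean (darkness)
             0.1 + 0.3 * turbidite + rng.normal(0, 0.05, n)]  # Clay_R2200

pca = PCA(n_components=2).fit(StandardScaler().fit_transform(data))
print("explained variance:", pca.explained_variance_ratio_.round(2))
print("PC1 loadings:", dict(zip(["grain", "Si/Al", "Fe/Ca", "Rmean", "Clay_R2200"],
                                pca.components_[0].round(2))))
```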

How to cite: Bertaz, J., Jacq, K., Colin, C., Liu, Z., debret, M., Zhao, H., and Lin, A. T.-S.: High-resolution grain-size analysis and non-destructive hyperspectral imaging of sediments from the Gaoping canyon levee to establish past typhoon and monsoon activities affecting Taiwan during the late Holocene, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13163, https://doi.org/10.5194/egusphere-egu23-13163, 2023.

EGU23-13329 | ECS | Orals | GI2.1

Combined use of NDT methods for steel rebar corrosion monitoring 

Giacomo Fornasari, Federica Zanotto, Andrea Balbo, Vincenzo Grassi, and Enzo Rizzo

This paper describes laboratory tests performed with NDT geophysical methods, namely Ground Penetrating Radar (GPR), Self-Potential (SP) and Direct Current (DC) methods, in order to monitor the corrosion of a rebar embedded in concrete. Even though GPR is a common geophysical method for reinforced concrete structures, the SP and DC techniques are not widely used. Rebar corrosion is one of the main causes of deterioration of reinforced engineering structures, and this degradation phenomenon reduces their service life and durability. Non-destructive testing and evaluation of rebar corrosion is a major issue for predicting the service life of reinforced concrete structures.

Several new experiments were performed at the Applied Geophysical laboratory of the University of Ferrara, building on the experience gained in previous tests (Fornasari et al., 2022), in which two reinforced concrete samples of about 50 cm x 30 cm, each with a central ribbed steel rebar 10 mm in diameter and 35 cm long, were cast and partially immersed in plastic boxes with salty and distilled water. In the present experiment, we applied a new protocol in which an epoxy resin was used to confine the corrosion to the exposed part of the rebar. The steel rebar was partially painted with a waterproof resin so that only its central part remained uncovered, over a length of 8 cm. The same waterproof epoxy resin was applied to part of the concrete sample, so that chloride diffusion occurred across a free zone of about 10 cm x 8 cm defined below the exposed rebar.

The experiments were carried out on two identically constructed reinforced concrete samples, one exposed to distilled water (sample “A”) and the second exposed to salty water containing chlorides (sample “B”). Both samples were immersed to a depth of only 1 cm from their lower surface. Sample B was immersed in a plastic box of salty water with increasing NaCl concentrations: an initial concentration of 0.1% was adopted for 7 days, then the concentration was increased to 1% and finally to 3.5% for a further 7 days. The experiment was set up in two phases. In the first phase, we monitored the “natural” corrosion occurring on sample B due to the diffusion of chlorides towards the steel rebar, comparing the obtained data with those of sample A exposed to distilled water. In the second phase, accelerated corrosion was applied to sample B in order to intensify the corrosion phenomena. The accelerated corrosion was designed to reach different theoretical levels of mass loss in the steel rebar: 2%, 5%, 10% and 20%. During the experiments, a 2 GHz C-Thrue GPR antenna, a multivoltmeter with a non-polarizing calomel reference electrode for SP, and an ABEM Terrameter LS for resistivity data were used to monitor rebar corrosion. The collected data were integrated to track the evolution of the corrosion phenomenon on the reinforcement steel rebar and to provide a quantitative analysis of the process.
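
The abstract does not give the impressed-current design, but target mass losses of this kind are conventionally translated into test durations via Faraday's law; the sketch below assumes an illustrative current value together with the exposed-rebar geometry described above.

```python
# Faraday's law: time needed to reach a target rebar mass loss at a given
# impressed current. The current value is an assumption for illustration.
import numpy as np

M_FE, Z, F = 55.85, 2.0, 96485.0       # g/mol, electrons per Fe ion, C/mol
rho_fe = 7.85                          # steel density (g/cm3)
d, L = 1.0, 8.0                        # exposed rebar: diameter, length (cm)
mass0 = rho_fe * np.pi * (d / 2) ** 2 * L   # mass of exposed segment (g)

I = 0.010                              # impressed current (A), assumed
for loss in (0.02, 0.05, 0.10, 0.20):
    m = loss * mass0                   # target mass loss (g)
    t = m * Z * F / (M_FE * I)         # t = m z F / (M I), in seconds
    print(f"{loss:4.0%} loss -> {m:5.2f} g -> {t / 86400:6.1f} days at {I*1e3:.0f} mA")
```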


How to cite: Fornasari, G., Zanotto, F., Balbo, A., Grassi, V., and Rizzo, E.: Combined use of NDT methods for steel rebar corrosion monitoring, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13329, https://doi.org/10.5194/egusphere-egu23-13329, 2023.

EGU23-13720 | ECS | Posters on site | GI2.1

A fully customizable data management system for Built Cultural Heritage surveys through NDT 

Irene Centauro, Teresa Salvatici, Sara Calandra, and Carlo Alberto Garzonio

The diagnosis of Built Cultural Heritage using non-invasive methods is useful for deepening the understanding of building characteristics, assessing the state of conservation of materials, and monitoring over time the effectiveness of restoration interventions.

Ultrasonic and sonic tests are Non-Destructive Techniques widely used to evaluate the consistency of historic masonry and stone elements and to identify internal defects on site, such as voids, detachments, and fractures. These tests, in addition to being suitable for Cultural Heritage because they are non-invasive, provide a fundamental preliminary screening that is useful for targeting further analysis.

Ultrasonic and sonic velocity tests performed on monuments involve a large amount of heterogeneous information obtained from many surveys. It is therefore important to optimize the data collected during both the documentation and diagnostic phases, making them easily accessible and meaningful for analysis and monitoring. In addition, the investigation set-up should follow a standard methodology that is repeatable over time, suitable for different types of artifacts, and prepared for comparison with other techniques.

An integrated data management system is therefore also useful for supporting the decision-making processes behind maintenance actions.

This work proposes the development of a complete IT management solution for ultrasonic and sonic measurements on different types of masonry and stone artifacts. The system consists of a browser-based collaboration and document management platform, a mobile/desktop application for data entry, and a data visualization and reporting tool. This set of tools enables the complete processing of data, from the on-site survey to their analysis and visualization.
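
As a hedged sketch of what a standardized data-entry record might look like in such a system (all field names are illustrative assumptions, not the authors' schema):

```python
# A minimal, standardized ultrasonic test record with the derived pulse
# velocity; field names are hypothetical, for illustration only.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class UltrasonicRecord:
    site: str                     # monument / case study
    element: str                  # e.g. "stone column", "plastered masonry"
    path_length_mm: float         # transducer spacing
    travel_time_us: float         # measured pulse travel time
    survey_date: date = field(default_factory=date.today)

    @property
    def velocity_m_s(self) -> float:
        # pulse velocity v = L / t  (mm/us equals km/s, hence * 1000 for m/s)
        return self.path_length_mm / self.travel_time_us * 1000.0

rec = UltrasonicRecord("Villa X", "stone column",
                       path_length_mm=300, travel_time_us=120)
print(rec.velocity_m_s)   # 2500.0 m/s
```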

The proposed methodology allows the standardization of the data-entry workflow and is scalable, so it can be adapted to different types of masonry and artifacts. Moreover, the system provides real-time verification of data, optimizes survey and analysis times, and reduces errors. The platform can also be integrated with machine learning models, useful for gaining insight from the data.

This solution, aimed at improving the approach to the diagnostics of Cultural Heritage, has been successfully applied by the LAM Laboratory of the Department of Earth Sciences (University of Florence) to different case studies (e.g., ashlars, frescoed walls, plastered masonries, stone columns, coats-of-arms, etc.) belonging to many important monuments.

How to cite: Centauro, I., Salvatici, T., Calandra, S., and Garzonio, C. A.: A fully customizable data management system for Built Cultural Heritage surveys through NDT, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13720, https://doi.org/10.5194/egusphere-egu23-13720, 2023.

EGU23-13934 | Orals | GI2.1

Pavements Layered Media Characterizations using deep learning-based GPR full-wave inversion 

Li Zeng, Biao Zhou, Xiongyao Xie, and Sébastien Lambot

The possibility of accurately estimating the subsurface electric properties of pavements from ground-penetrating radar (GPR) signals using inverse modeling is limited by the adequacy of the forward model describing the GPR-subsurface system. In this presentation, we improve the recently developed approach of Lambot et al., whose success relies on a stepped-frequency continuous-wave (SFCW) radar combined with an off-ground monostatic transverse electromagnetic horn antenna. A deep-learning-based method was adopted to train an intelligent model that reproduces the waveform of the Green’s functions. The method was applied and validated under laboratory conditions on a tank filled with two-layered sand subject to different water contents. Results showed agreement between the Green’s functions predicted by the deep-learning model and the measured ones. Model inversions for the dielectric permittivity and the antenna height further demonstrated the performance of the presented method.
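
A minimal sketch of the surrogate-plus-inversion idea, with a toy closed-form waveform standing in for the true SFCW Green's functions (the network architecture, parameter ranges, and grid are all assumptions):

```python
# Learn a mapping (permittivity, antenna height) -> waveform with a small
# neural network, then invert a "measured" waveform by misfit search.
# toy_forward() is NOT the real Green's function, just a placeholder.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 64)

def toy_forward(eps, h):
    # reflection delayed by antenna height, scaled by a toy reflection
    # coefficient (illustrative physics only)
    r = (np.sqrt(eps) - 1.0) / (np.sqrt(eps) + 1.0)
    return r * np.exp(-((t - 0.2 * h) ** 2) / 0.002)

params = np.c_[rng.uniform(3, 25, 1500), rng.uniform(0.5, 2.0, 1500)]  # eps, h (m)
waves = np.array([toy_forward(e, h) for e, h in params])

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0)
net.fit(params, waves)

# invert a "measured" waveform (true eps = 9, h = 1.2 m) by grid search
y_obs = toy_forward(9.0, 1.2)
grid = np.c_[np.repeat(np.linspace(3, 25, 60), 40),
             np.tile(np.linspace(0.5, 2.0, 40), 60)]
misfit = np.sum((net.predict(grid) - y_obs) ** 2, axis=1)
print("estimated (eps, h):", grid[np.argmin(misfit)].round(2))
```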

How to cite: Zeng, L., Zhou, B., Xie, X., and Lambot, S.: Pavements Layered Media Characterizations using deep learning-based GPR full-wave inversion, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13934, https://doi.org/10.5194/egusphere-egu23-13934, 2023.

EGU23-14658 | Orals | GI2.1

Influence of tectonic deformation on the mechanical properties of calcareous rocks: drawbacks of the non-destructive techniques  

Elisa Mammoliti, Veronica Gironelli, Danica Jablonska, Stefano Mazzoli, Antonio Ferretti, Michele Morici, and Mirko Francioni

Discontinuity surfaces are well known to influence the mechanical behaviour of rocks under compression. Non-destructive techniques, such as ultrasonic pulse velocity and sclerometers, are increasingly used to estimate the uniaxial compressive strength of rocks. In this study, several core samples obtained during the track-doubling works of the railway network near Genga (Marche Region, Central Italy) were analysed in order to assess the influence of the structural geological context (proximity to folds, faults, etc.) and of tectonic deformation on rock strength. Tests were conducted on rock specimens through: i) conventional uniaxial compression experiments, ii) non-destructive rebound-based methods such as the Schmidt hammer and Equotip, and iii) ultrasound. In this way, it was possible to critically analyse the use of these techniques for estimating the uniaxial compressive strength (also considering information about discontinuity type, orientation, and the nature of the filling). Finally, a petrographic analysis using an optical microscope was undertaken to support the observations derived from the analysis at the sample scale. The results indicate two main factors influencing strength at the specimen scale. The first and most decisive factor is the presence of natural pre-existing fractures. The second is the degree of tectonic deformation: the greater the deformation, the lower the strength. Furthermore, the combined use of uniaxial compression experiments, non-destructive rebound-based methods, and ultrasound made it possible to highlight the advantages and limitations of each technique and to define and propose new guidelines for their use.
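
Rebound readings are usually converted to strength through empirical correlations; below is a sketch of fitting one common functional form, UCS = a·exp(b·R), to synthetic data (the coefficients and scatter are assumptions, not the study's results):

```python
# Fit an empirical exponential correlation between Schmidt rebound number R
# and uniaxial compressive strength (UCS); data are synthetic placeholders.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(4)
R = rng.uniform(20, 60, 40)                          # rebound values
ucs_true = 5.0 * np.exp(0.045 * R)                   # synthetic "truth" (MPa)
ucs = ucs_true * rng.lognormal(0, 0.1, R.size)       # measured UCS with scatter

model = lambda R, a, b: a * np.exp(b * R)
(a, b), _ = curve_fit(model, R, ucs, p0=(1.0, 0.05))
resid = ucs - model(R, a, b)
r2 = 1 - np.sum(resid**2) / np.sum((ucs - ucs.mean())**2)
print(f"UCS = {a:.2f} * exp({b:.3f} R),  R^2 = {r2:.2f}")
```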

How to cite: Mammoliti, E., Gironelli, V., Jablonska, D., Mazzoli, S., Ferretti, A., Morici, M., and Francioni, M.: Influence of tectonic deformation on the mechanical properties of calcareous rocks: drawbacks of the non-destructive techniques , EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14658, https://doi.org/10.5194/egusphere-egu23-14658, 2023.

EGU23-14846 | ECS | Orals | GI2.1

Combined NDT data for road management through BIM models 

Luca Bertolini, Fabrizio D'Amico, Antonio Napolitano, Jhon Romer Diezmos Manalo, and Luca Bianchini Ciampoli

One of the main priorities for road administrations and stakeholders is the management and monitoring of critical infrastructures, especially transportation infrastructures. In this context, Building Information Modeling (BIM) can be one of the most effective methodologies for optimizing the management process. In Italy, several laws and regulations have been issued making the use of BIM procedures mandatory for the design of new infrastructures and emphasizing its role in the management of existing civil works [1, 2].

Monitoring operations on transportation infrastructures are generally conducted through on-site surveys. Non-Destructive Testing methods (i.e., GPR, LiDAR, laser profilometer, InSAR, etc.) have been used to perform these inspections, as their outputs have proven effective in determining the conditions of the infrastructure and its assets [3]. Moreover, the BIM methodology can prove a valuable tool for managing the data provided by these surveys, as it consists in creating digital models that carry information about the object they represent. These models can be used to store over time the different information obtained from NDT surveys and to carry out integrated analyses of the conditions of the infrastructure [4].

This study aims to analyze a potential BIM process capable of integrating the outputs of different NDT surveys to generate an informative digital model of an infrastructure and its assets. The proposed methodology is able to merge the data provided by the inspections, which are typically obtained by different operators and come in different file formats, into a single BIM model. The main goal of the research is to provide a process that optimizes the management procedures of transportation infrastructures by creating digital models capable of reducing the problems typically associated with the monitoring and maintenance of these critical civil works. By merging different information in a single environment and relying on survey data that are commonly analyzed separately, an integrated analysis of the infrastructure can be carried out and data loss can be reduced.

The study was developed by relying on real data, obtained from on-site surveys carried out over Italian infrastructures. As different outputs have been collected, BIM models of different assets of the analyzed infrastructures were defined. Preliminary results have shown that the proposed methodology can be a viable tool for optimizing the management process of these critical civil works.

Acknowledgements

The research is supported by the Italian Ministry of Education, University and Research under the National Project “Extended resilience analysis of transport networks (EXTRA TN): Towards a simultaneously space, aerial and ground sensed infrastructure for risks prevention”, PRIN 2017. Prot. 20179BP4SM.

References

[1] MIT, 2018. Ministero delle Infrastrutture e dei Trasporti, D. Lgs 109/2018

[2] MIT, 2021. Ministero delle Infrastrutture e dei Trasporti, D.M. 312/2021

[3] D’Amico F. et al., 2020. Integration of InSAR and GPR Techniques for Monitoring Transition Areas in Railway Bridges. NDT&E Int

[4] D’Amico, F. et al., 2022. Integrating Non-Destructive Surveys into a Preliminary BIM-Oriented Digital Model for Possible Future Application in Road Pavements Management. Infrastructures 7, no. 1: 10

How to cite: Bertolini, L., D'Amico, F., Napolitano, A., Manalo, J. R. D., and Bianchini Ciampoli, L.: Combined NDT data for road management through BIM models, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14846, https://doi.org/10.5194/egusphere-egu23-14846, 2023.

EGU23-14899 | ECS | Orals | GI2.1

Fusion of in-situ and spaceborne sensing for environmental monitoring 

Konstantinos Karyotis, Nikolaos Tsakiridis, and George Zalidis

Measuring soil reflectance in the field, rather than in a laboratory setting, can be very useful for numerous applications such as mapping the distribution of various soil properties, especially when prompt estimations are needed. Recent advances in spectroscopy, and specifically the development of low-cost spectrometers based on Micro-Electro-Mechanical Systems (MEMS), pave the way for real-time applications in agriculture and environmental monitoring. Compared to high-end spectrometers, whose spectral range extends from the Visible (VIS) and Near-InfraRed (NIR) to the Shortwave InfraRed (SWIR), MEMS cover limited parts of the electromagnetic spectrum, and important information is therefore missing. In parallel, new space-based products such as Planet Fusion are operationally ready and provide optical imagery (RGB and NIR) with high spatial (3 m) and temporal (daily) resolution. To this end, we assessed the potential of augmenting the bands captured by a commercial MEMS sensor (Spectral Engines Nirone S2.2 @ 1750–2150 nm) by adjoining the Planet Fusion bands at the exact sampling date and location from which the in-situ scans originate.

Employing the above, a set of portable MEMS sensors was used in a pilot area in Cyprus (Agia Varvara, Nicosia district) to develop a regional in-situ Soil Spectral Library (SSL). Sixty distinct locations were selected for capturing in-situ spectral reflectance after stratification of the Planet Fusion pixels of the pilot area, and a physical soil sample from each location was analyzed in the laboratory to determine the Soil Organic Carbon (SOC) content. During the visits, topsoil moisture was also measured.

The resulting SSL, containing the in-situ spectra, SOC, and moisture content, was further augmented with the four bands of the Planet Fusion imagery acquired on the exact date of the field visit. Three Random Forest models for SOC content estimation were then fitted, using as explanatory variables first only the MEMS data with moisture content, then the Planet Fusion bands, and finally all three available inputs.

The results showed a clear decrease in the RMSE of SOC content estimates when fusing in-situ with spaceborne data, highlighting the importance of the information contained in the VIS-NIR range when modeling SOC. The synergy of the two sensors is mutually beneficial: SOC absorption bands are also found in the SWIR region and are hard to detect by remote sensing, since they fall within the strong water absorption region (around 1950 nm). MEMS-based systems operating in the SWIR can support this process and, if combined with ancillary environmental measurements such as soil moisture, can provide a cost-effective solution for measuring SOC and other soil-related parameters. To reduce the reliance on laboratory analysis, it is necessary to establish protocols and guidelines for spectral data collection and management, ensuring that the collected data are consistent and of high quality, and to develop representative SSLs that can serve different modeling scenarios.
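
A minimal sketch of the three-model comparison on synthetic stand-in data (feature dimensions and the cross-validation choice are assumptions, not the study's setup):

```python
# Random Forests fitted on (i) MEMS bands + moisture, (ii) Planet Fusion
# bands, (iii) all inputs, compared by cross-validated RMSE.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 60                                            # sampling locations
mems = rng.random((n, 20))                        # MEMS bands, 1750-2150 nm
moisture = rng.random((n, 1))                     # topsoil moisture
planet = rng.random((n, 4))                       # Planet Fusion R, G, B, NIR
soc = 2 + mems[:, 0] + planet[:, 3] - moisture[:, 0] + rng.normal(0, 0.2, n)

for name, X in [("MEMS+moisture", np.c_[mems, moisture]),
                ("Planet Fusion", planet),
                ("all inputs", np.c_[mems, moisture, planet])]:
    rmse = -cross_val_score(RandomForestRegressor(random_state=0), X, soc,
                            scoring="neg_root_mean_squared_error", cv=5).mean()
    print(f"{name:14s} RMSE = {rmse:.3f}")
```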

How to cite: Karyotis, K., Tsakiridis, N., and Zalidis, G.: Fusion of in-situ and spaceborne sensing for environmental monitoring, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14899, https://doi.org/10.5194/egusphere-egu23-14899, 2023.

EGU23-14981 | ECS | Orals | GI2.1

Implementation of a Digital Twin integrating remote sensing information for network-level infrastructure monitoring 

Antonio Napolitano, Valerio Gagliardi, Luca Bertolini, Jhon Romer Diezmos Manalo, Alessandro Calvi, and Andrea Benedetto

Nowadays, there is an emerging demand from public authorities and managing bodies to evaluate the overall health of infrastructures and identify the most critical transport assets. At the national scale, thousands of transport infrastructures are in critical condition and require urgent maintenance actions. Currently, most available Digital Twins (DTs) allow data to be explored and visualized with only a limited kind of information, which still limits their operative and practical use by infrastructure owners, who require fast solutions for managing large amounts of data. Moreover, this idea is fully in line with European and national actions related to the development of a DT of the Earth’s systems, including the European Commission's “DestinE” programme with EUSPA and the European Space Agency (ESA). For this purpose, a dynamic DT model of a critical infrastructure is developed, using the available data on design information, historical maintenance operations, and monitoring surveys based on satellite imagery.

In this context, this study presents an innovative concept of a Digital Twin that integrates the details coming from NDT surveys, on-site inspections, and satellite-based information to store, manage, and visualize valuable information. This is achieved by analysing the main gaps and limitations of existing platforms and providing a viable integrated solution in the form of an upgradable strategic analysis tool. To this purpose, remote sensing methods are identified as viable technologies for continuous monitoring operations. More specifically, satellite data and processing techniques such as the Multi-Temporal SAR Interferometry approach are strategic for the continuous monitoring of the displacements associated with transport infrastructures. An advantage of these techniques is the lighter data processing required for the assessment of displacements and the detection of critical areas [1, 2].

The study introduces two main levels of innovation. The first is associated with the integrated approach to transportation planning, which incorporates quantitative data from multiple sources into more traditional territorial analysis models. The second relates to the technological engineering discipline and consists of the fusion of multi-source observation data with last-generation dynamic data connected to the environment.

Acknowledgements

This research is supported by the Project “M.LAZIO”, accepted and funded by the Lazio Region, Italy.

References

[1] D'Amico, F. et al., “Implementation of an interoperable BIM platform integrating ground based and remote sensing information for network-level infrastructures monitoring”, Spie Remote Sensing 2022.

[2] Gagliardi, V. et al., “Bridge monitoring and assessment by high-resolution satellite remote sensing technologies”, Spie Future Sensing Technologies 2020.

How to cite: Napolitano, A., Gagliardi, V., Bertolini, L., Manalo, J. R. D., Calvi, A., and Benedetto, A.: Implementation of a Digital Twin integrating remote sensing information for network-level infrastructure monitoring, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14981, https://doi.org/10.5194/egusphere-egu23-14981, 2023.

EGU23-15542 | ECS | Orals | GI2.1

Novel perspectives in transport infrastructure management: Data-Fusion, integrated monitoring and augmented reality 

Valerio Gagliardi, Luca Bianchini Ciampoli, Fabrizio D'Amico, Alessandro Calvi, and Andrea Benedetto

Infrastructure networks are crucial to ensuring the sustainability of the current development model, in which the movement of people and goods is essential. At the same time, transport assets are increasingly exposed to several issues, including changing climatic conditions and vulnerability and exposure to natural hazards such as hydraulic, geomorphological, landslide, and seismic phenomena, which can affect structural integrity and cause damage and deterioration. The context is made even more serious by the degradation of materials and the progressive ageing of infrastructure, often accelerated by environmental conditions and by inadequate, or not always effective, maintenance actions. This calls for novel methods that provide both large-scale coverage for network-scale linear infrastructures and the level of detail needed to diagnose causes and determine the priorities for the most effective countermeasures.

The proposed solution is based on a Data-Fusion approach, merging multi-source and multi-scale data to enhance the interpretation process in a holistic sense. The information comes from spaceborne Multi-temporal SAR Interferometry, complemented by more detailed aerial data acquired by UAVs and by ground-based Non-Destructive Testing methods: laser scanner surveys for resolution and digital integrability, high-resolution camera measurements assisted by artificial intelligence for surface degradation, and prospecting data collected with Ground Penetrating Radar technology. All these data can be analyzed simultaneously in a comprehensive digital platform, providing a useful tool to support operators and public bodies in prioritizing maintenance actions.

The digital platform can also be explored using augmented reality tools capable of generating and reproducing the Digital Twin of the inspected infrastructure in a real environment. This enables monitoring evaluations through a diagnostic approach that integrates spaceborne, aerial, ground-based, and geophysical surveys, allowing navigation within the infrastructure. Potential applications are numerous, ranging from the mapping of wide areas affected by potential criticalities to the definition of the main vulnerabilities related to seismic and hydraulic risks, the analysis of land changes surrounding the assets following extreme natural events, and the reconstruction of historical deformation trends of roads, railways, and bridges through the interpretation of SAR data.

Acknowledgments

This research is supported by the Italian Ministry of Education, University, and Research under the National Project “EXTRA TN”, PRIN2017, Prot. 20179BP4SM. In addition, this research is supported by the Project “MLAZIO” funded by Lazio Region (Italy).

How to cite: Gagliardi, V., Bianchini Ciampoli, L., D'Amico, F., Calvi, A., and Benedetto, A.: Novel perspectives in transport infrastructure management: Data-Fusion, integrated monitoring and augmented reality, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15542, https://doi.org/10.5194/egusphere-egu23-15542, 2023.

EGU23-16471 | ECS | Orals | GI2.1

Hydrogen isotope fractionation between leaf wax compounds and source water in tropical angiosperms 

Amrita Saishree, Shreyas Managave, and Vijayananda Sarangi

The hydrogen isotope fractionation between leaf wax compounds and source water, the apparent fractionation (εapp), necessary for reconstructing the hydrogen isotopic composition (δD) of precipitation, is mainly assessed through field and transect studies. The current εapp dataset, however, exhibits a bias toward mid-latitude regions of the Northern Hemisphere. Here we report the results of an outdoor experiment in which four evergreen and three deciduous species were grown with water of known δD value (-1.8‰) in a tropical semi-arid monsoon region. This allowed us to estimate εapp more accurately and to quantify εapp variability within a species and among different species. Among-species εapp values were -119 ± 23‰ (for the n-alkane of chain length n-C31) and -126 ± 27‰ (for the n-alkanoic acid of chain length n-C30). The similarity between the among-species variability in εapp reported here and that observed in field and transect studies suggests that the species effect, rather than uncertainty in the δD of source water, controls the uncertainty in community-averaged εapp. The δD fractionation between the n-C29 alkane and the n-C30 alkanoic acid (ε29/30) and between the n-C31 alkane and the n-C32 alkanoic acid (ε31/32) was 7 ± 25‰ and 6 ± 15‰, respectively, suggesting minimal hydrogen isotope fractionation during decarboxylation. Further, as we did not observe a systematic difference between the εapp of deciduous and evergreen species, changes in the relative proportion of these vegetation types in a community might not affect its εapp value.
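
For reference, the apparent fractionation follows from the standard definition; a short sketch with the experiment's source-water value and an illustrative wax value:

```python
# Apparent fractionation between leaf wax and source water:
# eps_app = [(dD_wax + 1000) / (dD_water + 1000) - 1] * 1000  (permil)
def epsilon_app(dD_wax, dD_water):
    return ((dD_wax + 1000.0) / (dD_water + 1000.0) - 1.0) * 1000.0

dD_water = -1.8          # irrigation water dD (permil), as in the experiment
dD_wax = -121.0          # an illustrative n-C31 alkane value, not a measurement
print(f"eps_app = {epsilon_app(dD_wax, dD_water):.1f} permil")   # about -119
```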

How to cite: Saishree, A., Managave, S., and Sarangi, V.: Hydrogen isotope fractionation between leaf wax compounds and source water in tropical angiosperms, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16471, https://doi.org/10.5194/egusphere-egu23-16471, 2023.

EGU23-16632 | ECS | Orals | GI2.1

Development of a flexible 2D DC Resistivity modelling technique for use in space domain 

Deepak Suryavanshi and Rahul Dehiya

Geoelectric non-destructive imaging and monitoring of the earth's subsurface require robust and adaptable numerical methods to solve the governing differential equation. Since DC data are usually acquired along a straight line, we solve the DC problem for the 2D case; the source of the DC method, however, exhibits a 3D nature. To account for this, 2D DC resistivity modeling is often carried out in the wavenumber domain. Studies have suggested ways of selecting optimum wavenumbers and weights, but this does not guarantee a universal choice: the chosen wavenumbers and related weights strongly influence the precision of the resulting solution in the space domain. Many forward modeling studies demonstrate that selecting effective wavenumbers is challenging, especially for complicated models with topography, anisotropy, and significant resistivity contrasts. Moreover, forward modeling requires many wavenumbers as the models become more complex.

This study focuses on developing a method that completely omits wavenumbers in 2D DC resistivity modeling. The work is motivated by a numerical experiment on a simple half-space model. Since the analytical response of such a model is easily calculated, we matched the analytical solution against the responses obtained with various wavenumbers and weights used in the literature. All the responses deviated from the analytical solution beyond a certain distance, and none of them was accurate at large offsets. Thorough testing of the numerical scheme showed that the wavenumbers selected for the forward modeling significantly affect how practical the approach is at large offsets.
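
A hedged sketch of the issue: for a homogeneous half-space the wavenumber-domain potential is known in closed form, and reconstructing the space-domain potential from a fixed wavenumber set amounts to a quadrature of an increasingly oscillatory integrand, which degrades at large offsets. The geometry and the wavenumber set below are illustrative.

```python
# Half-space check: the transformed potential is (rho*I/2pi) * K0(k*R) and
# V(y) = (2/pi) * integral of Vtilde(k) * cos(k*y) dk. A coarse fixed k-grid
# reproduces short offsets but fails at long ones.
import numpy as np
from scipy.special import k0

rho, I, R = 100.0, 1.0, 1.0      # resistivity (ohm*m), current (A), in-plane dist. (m)

def v_analytic(y):
    # 3D point-source potential on a homogeneous half-space surface
    return rho * I / (2 * np.pi * np.sqrt(R**2 + y**2))

def v_from_wavenumbers(y, ks):
    # trapezoid quadrature over a fixed set of wavenumbers
    vt = rho * I / (2 * np.pi) * k0(ks * R)
    return 2 / np.pi * np.trapz(vt * np.cos(ks * y), ks)

ks = np.logspace(-4, 1.5, 25)    # an illustrative "fixed" wavenumber selection
for y in (0.5, 2.0, 10.0, 50.0):
    va, vk = v_analytic(y), v_from_wavenumbers(y, ks)
    print(f"offset {y:5.1f} m: analytic {va:.4e}  wavenumber sum {vk:.4e}")
```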

To overcome this problem, a new boundary condition is derived and implemented in the existing numerical scheme, the Mimetic Finite Difference Method (MFDM). We consider the source to be placed at the origin of the coordinate system, which removes the dependence of the source term, expressed in the Fourier domain, on the wavenumber. The solution of the resulting equation is then a real-valued even function of the wavenumber. This ensures that the potential in the space domain for the 2D model is also a real-valued even function, symmetric about a plane perpendicular to the strike direction and passing through the origin. Because the first-order derivative of an even function vanishes at the plane of symmetry, this can be expressed mathematically as a Neumann boundary condition on that plane. We therefore propose a scheme to solve the 2D resistivity problem directly in the space domain using this boundary condition.

The developed algorithm is tested on isotropic and anisotropic two-layer models with large contrasts. The numerical solutions obtained using the modified boundary condition show considerable accuracy even at large offsets when compared with the analytical solution, whereas the results obtained using wavenumbers available in the literature deviate considerably from the analytical solution at large offsets.

How to cite: Suryavanshi, D. and Dehiya, R.: Development of a flexible 2D DC Resistivity modelling technique for use in space domain, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16632, https://doi.org/10.5194/egusphere-egu23-16632, 2023.

Data fusion in civil engineering: personal experience, vision and historical considerations 

Andrea Benedetto

Approximately eight years ago, after a research activity that I had started in the nineties on the application of GPR and, later, of NDTs to civil engineering, I realized that no technology can be considered self-standing. This is a consequence of the high complexity of civil engineering works and of the highly unpredictable impacts of ordinary processes and exceptional natural events. At the beginning of this century it was already clear that reliable and comprehensive monitoring of a phenomenon affecting bridges, tunnels, structures, or any civil engineering work is possible only by integrating data from different sources.

GPR was at that time a very promising technology, and many researchers worked in this field, measuring e.g. pavement deformation, asphalt moisture, ballast degradation, and the mechanical properties of materials. The accurate outcomes represented a great step forward for the science in this sector, but the final results proved partial, because the approach failed under a holistic perspective.

So, in the second decade of the 2000s, the need for a novel investigation paradigm arose, aiming not only to identify and quantify a problem, but also to diagnose its causes.

This was the stimulus to fuse data from different NDTs, under the assumption that information A and B together give much more than A+B: one piece of information (A) can explain one or more characters contained in a second (B) that cannot be inferred from knowledge of B alone.

On this basis I decided, together with international colleagues of the highest level, to establish a new session at EGU. That was in 2018; today marks the sixth edition!

Over these years, between 80 and 120 researchers took part in each session. The number of countries involved is also impressive, ranging from 10 to 17 per session, and the number of institutions from 36 to 50.

The number of contributions presented in the five editions is 141.

Since 2018, several special issues of prominent journals have been dedicated to data fusion. Recently, beyond the typical technologies such as GPR, UT, and ERT, great attention has been given to LiDAR, satellite, and UAV data.

Data fusion has also been directed to other interesting and promising fields such as archaeology, agriculture, and urban planning, to cite only a few.

I would like to underline that this great interest started in Europe and in the USA, but the geographical coverage is now much wider and includes, at the same level, Asian and emerging countries.

There is now a new frontier to be explored. My vision is that this holistic approach can be used to develop an innovative immersive environment through integration into augmented reality platforms, on which a digital twin can be generated and dynamically upgraded through an adaptive interface, also exploiting AI and machine learning paradigms.

How to cite: Benedetto, A.: Data fusion in civil engineering: personal experience, vision and historical considerations, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16864, https://doi.org/10.5194/egusphere-egu23-16864, 2023.

Integration of non-destructive surveys for BIM-based and structural-verified digital reconstruction of archaeological sites 

Building Information Modeling is a software-based parametric design approach that allows full interoperability between the various actors involved in a design or management process. Notwithstanding that it was specifically created for building projects, its use has been adapted to a wide range of applications, including transport infrastructure design and, more recently, cultural heritage. In this field, it has mainly been applied to raise the accuracy and effectiveness of restoration and stabilization activities for historical architecture.
The present study aims to demonstrate how the use of BIM may return remarkable outcomes in improving the current quality of digital valorisation and virtual reconstruction of historical structures, especially when their state of conservation is limited. Indeed, even though current digital reconstruction models are usually verified from an archaeological perspective, their structural consistency is never tested. This implies that many virtual reconstruction models are likely to represent structures that are historically accurate but make no structural sense, as, according to their geometric features and construction materials/techniques, they would not bear their own weight.
In this perspective, this study proposes a novel BIM-based methodology capable of both driving the archaeological reconstruction hypotheses and testing them on a structural basis. The model can be schematically represented by the following process (a minimal numerical sketch of step 5 follows the list):
1- Survey of the emerging: acquisition of data from surface archaeological surveys (topographic data, laser scanner, aerial photogrammetry, satellite images);
2- Survey of the hidden: acquisition of data from hypogeal surveys (georadar, electrical tomography, magnetometry);
3- Mechanical characterization: gathering of information on the materials of the find, whose mechanical qualities are also verified through load stress tests;
4- Virtual reconstruction: proposal of a possible virtual reconstruction hypothesis linked to structural and morphological features known to be present in the relevant historical periods;
5- Structural test: engineering and structural verification of the proposed hypothesis by means of finite element algorithms.
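
The following is only a back-of-the-envelope stand-in for that finite element verification, checking whether a prismatic masonry wall of hypothesised height can carry its self-weight; density, strength, and safety factor are assumed values:

```python
# Simplest structural plausibility check: base stress of a prismatic wall
# under self-weight is sigma = rho * g * h; compare with an assumed
# compressive strength. The real project uses FEM, not this shortcut.
RHO = 1800.0        # masonry density (kg/m3), assumed
G = 9.81            # gravity (m/s2)
SIGMA_C = 2.0e6     # compressive strength of weak historic masonry (Pa), assumed

def max_height(rho=RHO, sigma_c=SIGMA_C, safety=3.0):
    return sigma_c / (safety * rho * G)

h_hyp = 12.0        # hypothesised wall height (m)
verdict = "passes" if h_hyp < max_height() else "fails"
print(f"max admissible height ~ {max_height():.1f} m; {h_hyp} m hypothesis {verdict}")
```
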
The proposed methodology was tested on the archaeological area of the Villa and Circus of Maxentius along the Ancient Appian Way in Rome; all the planned activities were shared with and authorized by the Sovrintendenza Capitolina ai Beni Culturali, within the context of the Project BIMHERIT, funded by Regione Lazio (DTC Lazio Call, Prot. 305-2020-35609).

How to cite: Santarelli, R. and Ten, A.: Integration of non-destructive surveys for BIM-based and structural-verified digital reconstruction of archaeological sites, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-17489, https://doi.org/10.5194/egusphere-egu23-17489, 2023.

Fukushima and Chernobyl: similarities and differences of radiocesium behavior in the soil-water environment 

Alexei Konoplev

In the wake of the Chernobyl and Fukushima accidents, radiocesium has become the radionuclide of greatest environmental concern. The ease with which this radionuclide moves through the environment and is taken up by plants and animals is governed by its chemical forms and by site-specific environmental characteristics. Distinctions in climate and geomorphology, as well as in the 137Cs speciation of the fallout, result in differences in the migration rates of 137Cs in the environment and in the rates of its natural attenuation. In Fukushima areas, 137Cs was found to be strongly bound to soil and sediment particles, its bioavailability being reduced as a result. Up to 80% of the 137Cs deposited on the soil was reported to be incorporated in hot glassy particles (CsMPs) insoluble in water. Disintegration of these particles in the environment is much slower than that of Chernobyl-derived fuel particles. The higher annual precipitation and steep slopes of the contaminated Fukushima areas are conducive to higher erosion and higher total radiocesium wash-off. Typhoons Etau in 2015 and Hagibis in 2019 demonstrated the pronounced redistribution of 137Cs on river watersheds and floodplains, and in some cases natural self-decontamination occurred. Among the common features of 137Cs behavior in Chernobyl and Fukushima are a slow decrease in 137Cs activity concentration in small, closed, and semi-closed lakes and its particular seasonal variations: an increase in summer and a decrease in winter.

How to cite: Konoplev, A.: Fukushima and Chernobyl: similarities and differences of radiocesium behavior in the soil-water environment, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1081, https://doi.org/10.5194/egusphere-egu23-1081, 2023.

Eight-year variations in atmospheric radiocesium in Fukushima city and simulated resuspension from contaminated ground surfaces in eastern Japan 

After the Fukushima nuclear accident, atmospheric 134Cs and 137Cs measurements were taken in Fukushima city for 8 years, from March 2011 to March 2019. The airborne surface concentrations and deposition of radiocesium (radio-Cs) were high in winter and low in summer; these trends are the opposite of those observed in a contaminated forest area. The effective half-lives of 137Cs in the concentrations and deposition before 2015 (0.754 and 1.30 years, respectively) were significantly shorter than those after 2015 (2.07 and 4.69 years, respectively), likely because dissolved radio-Cs was discharged from the local terrestrial ecosystems more rapidly than particulate radio-Cs. In fact, the dissolved fractions in precipitation were larger than the particulate fractions before 2015, whereas the particulate fractions were larger after 2016. X-ray fluorescence analysis suggested that biotite may have played a key role in the environmental behavior of particulate radio-Cs after 2014.
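
As a sketch of how such effective half-lives are derived, assuming a simple exponential model fitted to a synthetic concentration series:

```python
# Fit C(t) = C0 * exp(-ln2 * t / Teff) to a concentration time series.
# The series below is synthetic, generated with Teff = 1.30 years.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(6)
t = np.linspace(0, 4, 50)                               # years since deposition
c = 10.0 * np.exp(-np.log(2) * t / 1.30) * rng.lognormal(0, 0.15, t.size)

decay = lambda t, c0, teff: c0 * np.exp(-np.log(2) * t / teff)
(c0, teff), _ = curve_fit(decay, t, c, p0=(10.0, 1.0))
print(f"effective half-life = {teff:.2f} y")            # ~1.30 for this data
```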

Resuspension of 137Cs from the contaminated ground surface to the atmosphere is essential for understanding the long-term environmental behavior of 137Cs. We assessed the 137Cs resuspension flux from bare soil and forest ecosystems in eastern Japan in 2013 using a numerical simulation constrained by surface air concentration and deposition measurements. In this estimation, the total areal annual resuspension of 137Cs is 25.7 TBq, equivalent to 0.96% of the initial deposition (2.68 PBq). The current simulation underestimated the 137Cs deposition in Fukushima city in winter by more than an order of magnitude, indicating the presence of additional resuspension sources. The site in Fukushima city is surrounded by major roads. Heavy traffic on wet and muddy roads after snow removal operations could generate superlarge (approximately 100 μm in diameter) road dust or road salt particles, which are not included in the model but might contribute to the observed 137Cs at the site.

The current presentation is based on two published papers: Watanabe et al., ACP, https://doi.org/10.5194/acp-22-675-2022 (2022) and Kajino et al., ACP, https://doi.org/10.5194/acp-22-783-2022 (2022). The presenters would like to thank all the co-authors of the two papers for their significant contributions.

How to cite: Kajino, M. and Watanabe, A.: Eight-year variations in atmospheric radiocesium in Fukushima city and simulated resuspension from contaminated ground surfaces in eastern Japan, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1607, https://doi.org/10.5194/egusphere-egu23-1607, 2023.

EGU23-2540 | Posters on site | GI2.2

Hydrological setting control 137Cs and 90Sr concentration at headwater catchments in the Chornobyl Exclusion Zone 

Yasunori Igarashi, Yuichi Onda, Koki Matsushita, Hikaru Sato, Yoshifumi Wakiyama, Hlib Lisovyi, Gennady Laptev, Dmitry Samoilov, Serhii Kirieiev, and Alexei Konoplev

Concentration-discharge relationships are widely used to understand the hydrologic processes controlling river water chemistry. We investigated how hydrological processes affect radionuclide concentrations (137Cs and 90Sr) in surface water in headwater catchments of the Chornobyl Exclusion Zone in Ukraine. In the flat wetland catchment, the depth of the saturated soil layer changed little throughout the year, but changes in the saturated soil surface area during snowmelt and immediately after rainfall affected water chemistry by changing the opportunities for contact between surface water and the soil surface. In slope catchments with few wetlands, on the other hand, river water chemistry is shaped by changes in the contributions of "shallow" and "deep" water as the pathways supplying the river change. Dissolved and suspended 137Cs concentrations did not correlate with discharge rate or with competitive cations, but the solid/liquid ratio of 137Cs showed a significant negative relationship with water temperature, and further studies are needed on the sorption/desorption reactions involved. 90Sr concentrations in surface water were strongly related to water pathways in each of the catchments: the contact between surface water and the soil surface, and the changing contributions of shallow and deep water to streamflow, can change 90Sr concentrations in the wetland and slope catchments, respectively. In this study, we revealed that radionuclide concentrations in Chornobyl rivers are strongly affected by the water pathways in the headwater catchments.
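
As a sketch of the concentration-discharge diagnostic underlying such analyses, a power law C = a·Q^b fitted in log-log space to synthetic data (b < 0 indicates dilution, b near 0 chemostatic behaviour):

```python
# Fit C = a * Q^b by linear regression of log(C) on log(Q).
# Synthetic data stand in for the 90Sr observations.
import numpy as np

rng = np.random.default_rng(7)
Q = rng.lognormal(0, 0.8, 200)                       # discharge (m3/s), synthetic
C = 5.0 * Q**-0.3 * rng.lognormal(0, 0.2, Q.size)    # 90Sr-like concentration

b, log_a = np.polyfit(np.log(Q), np.log(C), 1)       # slope, intercept
print(f"C = {np.exp(log_a):.2f} * Q^{b:.2f}")        # recovers b ~ -0.3
```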

How to cite: Igarashi, Y., Onda, Y., Matsushita, K., Sato, H., Wakiyama, Y., Lisovyi, H., Laptev, G., Samoilov, D., Kirieiev, S., and Konoplev, A.: Hydrological setting control 137Cs and 90Sr concentration at headwater catchments in the Chornobyl Exclusion Zone, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2540, https://doi.org/10.5194/egusphere-egu23-2540, 2023.

EGU23-2561 | Posters on site | GI2.2

Dispersion of particle-reactive elements caused by the phase transitions in scavenging 

Kyeong Ok Kim, Vladimir Maderich, Igor Brovchenko, Kyung Tae Jung, Sergey Kivva, Katherine Kovalets, and Haejin Kim

A generalized model of scavenging of the reactive radionuclide 239,240Pu was developed, in which the sorption-desorption processes of the oxidized and reduced forms on multifraction suspended particulate matter are described by first-order kinetics. One-dimensional transport-diffusion-reaction equations were solved analytically and numerically. In the idealized case of an instantaneous release of 239,240Pu at the ocean surface, the concentration profile asymptotically tends to a symmetric spreading bulge in the form of a Gaussian moving downward at constant velocity. The corresponding diffusion coefficient is the sum of the physical diffusivity and the apparent diffusivity caused by the reversible phase transitions between the dissolved and particulate states. Using the method of moments, we analytically obtained formulas for both the center-of-mass velocity and the apparent diffusivity. It was found that, in ocean waters where oxygen is present at great depths, the complete problem can be replaced, to a first approximation, by a simplified problem for a mixture of forms with a single effective distribution coefficient. This conclusion was confirmed by the modeling results for the well-ventilated Eastern Mediterranean. In agreement with the measurements, the calculations demonstrate the presence of a slowly descending maximum for all forms of the concentration. The ratio of the reduced to the oxidized form was approximately 0.22-0.24. At the same time, 239,240Pu scavenging calculations for the anoxic Black Sea deep water reproduced the transition from the oxidized to the reduced form of 239,240Pu with depth, in accordance with the measurement data.
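
The mechanism behind the apparent diffusivity can be illustrated with a two-state toy model (a strong simplification of the full multifraction redox model): particles alternate by first-order kinetics between a dissolved state with no settling and a particulate state settling at ws, and the reversible exchange adds a Taylor-like dispersion term on top of the physical diffusivity. All parameter values below are invented for illustration.

```python
# Monte Carlo check of the two-state result: effective velocity ws*fp and
# apparent diffusivity D + ws^2 * fp * (1 - fp) / (k12 + k21).
import numpy as np

rng = np.random.default_rng(8)
ws, D = 100.0, 300.0              # settling (m/y), physical diffusivity (m2/y)
k12, k21 = 20.0, 60.0             # sorption / desorption rates (1/y)
fp = k12 / (k12 + k21)            # equilibrium particulate fraction
dt, nt, n = 1e-3, 5000, 10000     # time step (y), steps, particles

z = np.zeros(n)
particulate = rng.random(n) < fp
for _ in range(nt):
    z += ws * particulate * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n)
    rates = np.where(particulate, k21, k12)      # rate of leaving current state
    particulate ^= rng.random(n) < rates * dt

T = nt * dt
print("bulge velocity   :", z.mean() / T, "  theory:", ws * fp)
print("bulge diffusivity:", z.var() / (2 * T),
      "  theory:", D + ws**2 * fp * (1 - fp) / (k12 + k21))
```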

How to cite: Kim, K. O., Maderich, V., Brovchenko, I., Jung, K. T., Kivva, S., Kovalets, K., and Kim, H.: Dispersion of particle-reactive elements caused by the phase transitions in scavenging, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2561, https://doi.org/10.5194/egusphere-egu23-2561, 2023.

EGU23-3049 | ECS | Posters on site | GI2.2

Changes in Air Dose Rates due to Soil Water Content in Forests in Fukushima Prefecture, Japan 

Miyu Nakanishi, Yuichi Onda, Hiroaki Kato, Junko Takahashi, Hikaru Iida, and Momo Takada

Radionuclides released and deposited by the 2011 Fukushima Daiichi Nuclear Power Plant accident caused an increase in air dose rates in forests in Fukushima Prefecture. Although air dose rates have been reported to increase during rainfall, we found that they decreased during rainfall in Fukushima forests, which is attributed to the shielding effect of soil moisture. This study aimed to develop a method for estimating rainfall-driven changes in air dose rates even in the absence of soil moisture data. We therefore used the preceding rainfall (Rw), an indicator that also takes past rainfall into account: we calculated Rw in Namie Town, Futaba District, Fukushima Prefecture, from May to July 2020 and estimated the air dose rates. In this area, air dose rates decreased with increasing soil moisture. Air dose rates could be estimated by combining Rw computed with half-lives of 2 hours and 7 days and by considering hysteresis in the absorption and drainage processes. The coefficient of determination (R2) exceeded 0.70 for the estimation of soil water content, and good agreement was also observed when estimating air dose rates from Rw (R2 > 0.65). The same method was used to estimate air dose rates at the Kawauchi site from May to July 2019. Due to the high water repellency of the Kawauchi site, the increase in soil water content was very small, and the change in air dose rate was almost negligible when soil water content was below 15% and rainfall below 10 mm. This study enabled the estimation of soil water content and air dose rate from rainfall and captured the effect of rainfall on the decreasing trend of air dose rates. The method can therefore be used in the future as an indicator to determine whether temporary changes in air dose rates are caused by influences other than rainfall. It also contributes to improving methods for estimating external dose rates for humans and for terrestrial animals and plants in forests.
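
A minimal sketch of the preceding-rainfall index: hourly rainfall accumulated with exponential decay at the two half-lives used in the study (2 hours and 7 days); the rainfall series and any final regression weights are assumptions.

```python
# Preceding-rainfall index Rw: recursive exponentially decaying sum of
# hourly rainfall. Two indices (fast and slow) would be combined with
# regression coefficients to estimate soil water content.
import numpy as np

rng = np.random.default_rng(9)
hours = 24 * 90                                   # three months of hourly data
rain = np.where(rng.random(hours) < 0.05, rng.exponential(2.0, hours), 0.0)

def preceding_rainfall(rain_mm, half_life_h):
    decay = np.exp(-np.log(2) / half_life_h)      # per-hour decay factor
    rw = np.zeros_like(rain_mm)
    for i in range(1, rain_mm.size):
        rw[i] = rw[i - 1] * decay + rain_mm[i]    # recursive update
    return rw

rw_fast = preceding_rainfall(rain, 2.0)           # 2-hour half-life
rw_slow = preceding_rainfall(rain, 7 * 24.0)      # 7-day half-life
# soil water content ~ b0 + b1*rw_fast + b2*rw_slow (coefficients by regression)
print(rw_fast.max().round(1), rw_slow.max().round(1))
```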

How to cite: Nakanishi, M., Onda, Y., Kato, H., Takahashi, J., Iida, H., and Takada, M.: Changes in Air Dose Rates due to Soil Water Content in Forests in Fukushima Prefecture, Japan, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3049, https://doi.org/10.5194/egusphere-egu23-3049, 2023.

Modeling and sensitivity study of wet scavenging models for the Fukushima accident using 1-km-resolution meteorological field data 

Wet scavenging remains a challenge in modeling the atmospheric transport of 137Cs following the Fukushima Daiichi Nuclear Power Plant accident, as it significantly influences the detailed spatiotemporal 137Cs distribution. Numerous wet deposition schemes have been proposed for 137Cs, but it is often difficult to evaluate them consistently because of the limited resolution of meteorological field data and detailed differences in model implementations. This study evaluated the behavior of 25 combinations of in-cloud and below-cloud wet scavenging models in the framework of the Weather Research and Forecasting-Chemistry model, using high-resolution (1 km × 1 km) meteorological input. This implementation enables consistent and detailed evaluation, revealing the complex local behaviors of these combinations. The 1-km-resolution simulations were compared with simulations obtained previously using 3-km-resolution meteorological field data, with respect to the rainfall pattern over eastern Japan during the accident, the atmospheric concentrations acquired at the regional SPM monitoring sites, and the total ground deposition. The capability of these models to reproduce local-scale observations was also investigated using observations at the Naraha site, which is only 17.5 km from the Fukushima Daiichi Nuclear Power Plant, and the performance of the ensemble mean was evaluated. Results revealed that the 1-km simulations reproduce the cumulative rainfall pattern during the Fukushima accident better than the 3-km simulations, although with spatiotemporal variability in accuracy; rainfall below 1 mm/h is critical for simulation accuracy. Single-parameter wet deposition models that rely solely on rainfall improved in the 1-km simulations relative to the 3-km simulations because of the improved rainfall simulation. Multiparameter models that rely on both cloud and rainfall showed more robust performance in both the 3-km and 1-km simulations, and the Roselle–Mircea model performed best among the 25 models considered. Besides rainfall, wind transport substantially influenced the removal of atmospheric 137Cs and was non-negligible even during periods in which wet deposition was dominant. The ensemble mean of the 1-km simulations reproduces the high-deposition area better, and its total deposition amount is closer to the observations, than the 3-km simulation. At the local scale, the 1-km-resolution simulations effectively reproduced the 137Cs concentrations observed at the Naraha site, but with deviations in peak timing, mainly because of biased wind direction. These findings indicate the necessity of a multi-parameter model for robust regional-scale wet deposition simulation and of refined wind and dispersion modeling for local-scale simulation of 137Cs concentrations.
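
A hedged sketch of the simplest class of schemes discussed above, a single-parameter below-cloud model in which the scavenging coefficient is a power law of the rain rate; the constants are illustrative values of a plausible order of magnitude, not those of any of the 25 schemes.

```python
# Single-parameter washout: Lambda = A * Rain^B (1/s), column depletion
# C(t + dt) = C(t) * exp(-Lambda * dt). A and B are illustrative constants.
import numpy as np

A, B = 8.4e-5, 0.79          # assumed power-law constants
dt = 3600.0                  # one model hour (s)

for rain in (0.1, 0.5, 1.0, 5.0):            # rain rate (mm/h)
    lam = A * rain**B                        # scavenging coefficient (1/s)
    frac_removed = 1.0 - np.exp(-lam * dt)
    print(f"{rain:4.1f} mm/h -> Lambda = {lam:.2e} 1/s, "
          f"{frac_removed:5.1%} removed per hour")
```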

How to cite: Zhuang, S., Dong, X., Xu, Y., and Fang, S.: Modeling and sensitivity study of wet scavenging models for the Fukushima accident using 1-km-resolution meteorological field data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4152, https://doi.org/10.5194/egusphere-egu23-4152, 2023.

EGU23-4697 | ECS | Orals | GI2.2

Quantifying the riverine sources of sediment and associated radiocaesium deposited off the coast of Fukushima Prefecture 

Pierre-Alexis Chaboche, Wakiyama Yoshifumi, Hyoe Takata, Toshihiro Wada, Olivier Evrard, Toshiharu Misonou, Takehiko Shiribiki, and Hironori Funaki

The Fukushima Daiichi Nuclear Power Plant (FDNPP) accident, triggered by the Great East Japan Earthquake and the subsequent tsunami in March 2011, released large quantities of radionuclides into the terrestrial and marine environments of Fukushima Prefecture. Although radiocaesium (i.e. 134Cs and 137Cs) activity in these environments has decreased since the accident, secondary inputs via the rivers draining and eroding the main terrestrial radioactive plume were shown to sustain high levels of 137Cs in riverine and coastal sediments, which are likely deposited off the coast of the Prefecture. Accordingly, identifying the sources of sediment is required to elucidate the links between terrestrial and marine radiocaesium dynamics and to anticipate the fate of these persistent radionuclides in the environment.

The objective of this study is to develop an original sediment source tracing technique to quantify the riverine sources of the sediment and associated radionuclides accumulated in the Pacific Ocean. Target coastal sediment cores (n=6), 20 to 60 cm in length, were collected during cruise campaigns between July and September 2022 at the Ota (n=2), Niida (n=1) and Ukedo (n=3) river mouths. Prior to gamma spectrometry measurements, the sediment cores were opened and cut into 2-cm increments, oven-dried at 50°C for at least 48 hours, ground, and passed through a 2-mm sieve.

Preliminary results on the spatial and depth distribution of radiocaesium in these samples show a strong heterogeneity, with the highest radiocaesium levels (up to 134 ± 2 and 4882 ± 11 Bq kg-1 for 134Cs and 137Cs, respectively) found in coastal sediment cores located at the Ukedo river mouth. In contrast, no trace or only low levels of Fukushima-derived radiocaesium were found in the core from the Niida river mouth and in one sediment core from the Ota river mouth. Additional measurements will be conducted to determine the physico-chemical properties of this sediment in order to select the optimal combination of tracers, which will then be introduced into un-mixing models. This increased knowledge will undoubtedly be useful for watershed and coastal management in the FDNPP post-accidental context.

How to cite: Chaboche, P.-A., Yoshifumi, W., Takata, H., Wada, T., Evrard, O., Misonou, T., Shiribiki, T., and Funaki, H.: Quantifying the riverine sources of sediment and associated radiocaesium deposited off the coast of Fukushima Prefecture, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4697, https://doi.org/10.5194/egusphere-egu23-4697, 2023.

EGU23-4925 | Posters on site | GI2.2

Verification of reproductivity of 137Cs activity concentration in the database by an ocean general circulation model 

Daisuke Tsumune, Frank Bryan, Keith Lindsay, Kazuhiro Misumi, Takaki Tsubono, and Michio Aoyama

Radioactive cesium (137Cs) is distributed in the global ocean due to global fallout from atmospheric nuclear tests, releases from reprocessing plants in Europe, and the supply to the ocean from the Fukushima Daiichi Nuclear Power Plant accident. In order to detect future contamination by radionuclides, it is necessary to understand the global distribution of radionuclides such as 137Cs. For this purpose, the IAEA is compiling a database of observation results (MARIS). However, since the spatio-temporal density of observed data varies widely, it is difficult to obtain a complete picture from the database alone. Comparative validation using ocean general circulation model (OGCM) simulations is useful for interpreting these observations, and global OGCM (CESM2, POP2) simulations were conducted to clarify the behavior of 137Cs in the ocean. The horizontal resolution is 1.125° in longitude and 0.28° to 0.54° in latitude. The minimum layer spacing near the sea surface is 10 m, and the spacing increases with depth to a maximum of 250 m over 60 vertical levels. Climatological values were used as the driving force. As a source term of 137Cs to the ocean, atmospheric fallout from atmospheric nuclear tests was newly established based on rainfall and other data, and it was confirmed to reproduce observations better than previous estimates. Furthermore, the releases from reprocessing plants in Europe and the leakage due to the accident at the Fukushima Daiichi Nuclear Power Plant were taken into account. Input conditions for 2020 were assumed to continue after 2020, and calculations were performed from 1945 to 2030. The simulated 137Cs activities were found to be in good agreement with observations, especially in the Atlantic and Pacific Oceans, where the observation density is high. On the other hand, they were underestimated in the Southern Hemisphere, suggesting the need for further improvement of the fallout data. 137Cs concentrations from the Fukushima Daiichi Nuclear Power Plant accident in March 2011 were generally in good agreement, although the reproducibility remained somewhat problematic due to insufficient model resolution. In other basins, the concentration characteristics could be determined, although the observations were insufficient. Radioactivity concentrations of atmospheric nuclear test-derived 137Cs may continue to be detected in the global ocean after 2030. The results of this simulation are useful for planning future observations to fill the gaps in the database.

How to cite: Tsumune, D., Bryan, F., Lindsay, K., Misumi, K., Tsubono, T., and Aoyama, M.: Verification of reproductivity of 137Cs activity concentration in the database by an ocean general circulation model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4925, https://doi.org/10.5194/egusphere-egu23-4925, 2023.

EGU23-4947 | ECS | Posters on site | GI2.2

Vertical distribution of radioactive cesium-rich microparticles in forest soil of Hamadori area, Fukushima Prefecture 

Takahiro Tatsuno, Hiromichi Waki, Naoto Nihei, and Nobuhito Ohte

Large amounts of radionuclides were scattered by the Fukushima Daiichi Nuclear Power Plant (FDNPP) accident. Previous studies showed that FDNPP-derived radioactive cesium-rich microparticles (CsMPs), a few μm in size, are present in the soil and river water around Fukushima Prefecture [1]. CsMPs have a high radioactive cesium (Cs) concentration per unit mass and can therefore be one of the factors leading to overestimation of the Cs concentration in samples. Because Cs in CsMPs may not react directly with clay particles, unlike Cs ions in the liquid phase, CsMPs are considered to act as Cs carriers in soils [2]. However, unlike ionic Cs and Cs adsorbed onto clay particles, the distribution and dynamics of CsMPs in soils have not been clarified. In this study, we investigated the vertical distribution of CsMPs in forest soil and the soil properties in Fukushima Prefecture, Japan.

Soil samples were collected from a forest in the difficult-to-return zone, approximately 10 km from the FDNPP. Undisturbed soil samples were collected from 0-35 cm soil depth at 5 cm intervals using a core sampler to investigate soil properties. Furthermore, litter samples on the surface soil layer were collected. Using these samples, the vertical distributions of the Cs concentration in the soil and of Cs derived from CsMPs were investigated. The Cs concentration in samples placed in a 100 mL U8 container was measured using a germanium semiconductor detector. Cs derived from CsMPs was evaluated using an imaging plate, with reference to the method for quantification of CsMPs [3].

Like Cs adsorbed on the soil, CsMPs were mostly distributed in the soil surface layer between 0 and 5 cm depth. We consider that straining may be one of the mechanisms of CsMP retention at the soil surface. Bradford et al. (2003) [4] showed that straining might be a significant mechanism for colloid retention when the average particle size in the porous medium is less than 200 times the colloidal particle size. In this study, assuming a CsMP size of approximately 1 µm, the average particle size of the soil collected from the 0-5 cm surface layer was less than 200 times that of the CsMPs. Moreover, the average particle size decreased below 5 cm depth; therefore, the straining mechanism is considered to be even stronger there.
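
The 200-times criterion invoked above can be checked with simple arithmetic; a minimal Python sketch (grain sizes hypothetical):

def straining_possible(d50_soil_um, d_colloid_um, ratio_threshold=200.0):
    # Bradford-type criterion: straining can be significant when the median
    # grain size is less than ~200 times the colloid (CsMP) size.
    return d50_soil_um / d_colloid_um < ratio_threshold

# For ~1 um CsMPs and hypothetical median grain sizes:
print(straining_possible(150.0, 1.0))  # surface layer (0-5 cm): True
print(straining_possible(60.0, 1.0))   # finer subsoil: True, stronger straining expected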

This work was supported by FY2022 Sumitomo Foundation and FY2022 Internal Project of Institute of Environmental Radioactivity, Fukushima University.


References

[1] Igarashi, Y. et al., 2019. J. Environ. Radioact. 205–206, 101–118.

[2] Tatsuno, T. et al., 2022. J. Environ. Manage. 329, 116983.

[3] Ikehara et al., 2018. Environ. Sci. Technol. 52, 6390–6398.

[4] Bradford et al., 2003. Environ. Sci. Technol. 37, 2242–2250.

How to cite: Tatsuno, T., Waki, H., Nihei, N., and Ohte, N.: Vertical distribution of radioactive cesium-rich microparticles in forest soil of Hamadori area, Fukushima Prefecture, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4947, https://doi.org/10.5194/egusphere-egu23-4947, 2023.

EGU23-5042 | ECS | Posters on site | GI2.2

Changes in 90Sr transport dynamics in groundwater after large-scale groundwater drawdown in the vicinity of the cooling pond at the Chornobyl Nuclear Power Plant 

Hikaru Sato, Naoaki Shibasaki, Maksym Gusyev, Yuichi Onda, and Dmytro Veremenko

Migration of long-lived radioactive 90Sr introduced by nuclear accidents and radioactive waste requires long-term monitoring and protection management due to its half-life of 28.8 years and its high mobility in water. Presently, 37 years have passed since the largest worldwide 90Sr contamination was released and deposited around the Chornobyl Nuclear Power Plant (ChNPP). In the vicinity of the ChNPP, the water level of the cooling pond (CP) has declined since May 2014 following the decommissioning phase of the Unit 3 reactor. The drawdown of the CP lowered the groundwater level over an extensive area (about 70 km2), and the resulting change in the groundwater system has raised concerns about possible changes in 90Sr concentrations in water and in transport dynamics to the Pripyat River. Therefore, this study evaluated how 90Sr transport dynamics were influenced by changes in the groundwater flow system from 2011 to 2020, based on observed data and the results of a groundwater flow simulation in the CP vicinity.

The numerical simulation was conducted from 2011 to 2020 at a monthly time step using USGS MODFLOW with the PM11 GUI and was calibrated to groundwater heads measured at monitoring wells. Between the CP and the Pripyat River, estimated pore velocities near the river were reduced compared with velocities before the CP drawdown, owing to the decrease in the hydraulic gradient between the CP and the river. The decrease in groundwater velocity results in decreased groundwater discharge and delayed 90Sr transport. Therefore, the amount of 90Sr transported from the CP to the river is smaller than in the period prior to the CP drawdown. The reduced 90Sr transport is expected to have less impact on the radioactivity in the river water, even in the Pripyat River floodplain northwest of the CP, where 90Sr concentrations significantly increased after the CP drawdown. In addition, the measured and simulated changes in groundwater flow direction and velocity suggested the possibility of 90Sr accumulation in the floodplain, caused by stagnant groundwater resulting from the reduced velocity and by additional 90Sr infiltration from surrounding ponds located on the Pripyat River floodplain. Therefore, enhancing the current monitoring of 90Sr concentrations near the floodplain is needed for long-term monitoring and protection management to mitigate this risk.
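
The pore velocities referred to above follow from Darcy's law, v = K·i/n_e; a minimal Python sketch with hypothetical aquifer values (not the calibrated MODFLOW parameters):

def pore_velocity(k_m_per_d, gradient, effective_porosity):
    # Pore (seepage) velocity: Darcy flux K*i divided by effective porosity.
    return k_m_per_d * gradient / effective_porosity

# Halving the hydraulic gradient after the CP drawdown halves the advection
# velocity of 90Sr toward the river (values hypothetical):
print(pore_velocity(10.0, 2e-3, 0.25))  # before drawdown: 0.08 m/day
print(pore_velocity(10.0, 1e-3, 0.25))  # after drawdown: 0.04 m/day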

How to cite: Sato, H., Shibasaki, N., Gusyev, M., Onda, Y., and Veremenko, D.: Changes in 90Sr transport dynamics in groundwater after large-scale groundwater drawdown in the vicinity of the cooling pond at the Chornobyl Nuclear Power Plant, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5042, https://doi.org/10.5194/egusphere-egu23-5042, 2023.

EGU23-6019 | GI2.2

Transport of H-3 and I-129 in water and their uptake by marine organisms due to the planned release of Fukushima storage water 

R. Bezhenar, H. Takata, and V. Maderich

The 3D model THREETOX was applied for the long-term simulation of the planned release of radioactively contaminated water from the Fukushima storage tanks to the marine environment. Two radionuclides were considered: 3H, which has the largest activity in the tanks, and 129I, which can cause the largest radiation dose to humans. A constant release rate of 3H equal to 22 TBq/y, according to TEPCO estimations, and a constant release rate of 129I equal to 361 MBq/y, according to estimations from the current study, were used in the simulations.

The THREETOX model used monthly averaged currents from the KIOST-MOM model. A dynamic food web model was included in THREETOX. In the model, organisms take up activity directly from the water and through the food chain. The food chain consists of phytoplankton, zooplankton, non-piscivorous (prey) fish, and piscivorous (predatory) fish. In the case of 129I, macro-algae were also considered. The modelling area covers the Fukushima coastal waters and extends eastward for 1600 km from the coast and for 1300 km from north to south.
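
Dynamic food-web models of this kind are typically built from first-order uptake and elimination terms; the Python sketch below shows the generic structure with hypothetical rate constants, not the actual THREETOX parameterization:

def step_organism(c_org, c_water, c_food, ku_water, ku_food, k_elim, dt):
    # One Euler step of dC/dt = ku_water*C_water + ku_food*C_food - k_elim*C_org:
    # uptake directly from water plus uptake through the food chain, minus elimination.
    return c_org + dt * (ku_water * c_water + ku_food * c_food - k_elim * c_org)

# Constant 3H water concentration (Bq/m3) driving prey and then predatory fish:
c_water, c_prey, c_pred = 100.0, 0.0, 0.0
for _ in range(5000):
    c_prey = step_organism(c_prey, c_water, 0.0, 1e-4, 0.0, 0.05, 0.1)
    c_pred = step_organism(c_pred, c_water, c_prey, 5e-5, 0.02, 0.05, 0.1)
print(c_prey, c_pred)  # both approach equilibria set by uptake/elimination ratios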

From the model results, we can see how contamination will spread along the coast in different seasons. For example, in summer the currents near the coast are directed to the north, which leads to contamination of Sendai Bay. This means that at different points along the coast, the concentration of radionuclides can change periodically according to the currents, which vary during the year. Calculated activity concentrations at several points along the coast of Japan, corresponding to the largest cities in the area of interest, were extracted from the model results. For example, the calculated concentration of 3H in water at the Tomioka point, which is quite close to the FDNPP, can sometimes exceed 200 Bq/m3. At the Soma point, the concentration will exceed 50 Bq/m3, and at the Iwaki-Onahama point, 20 Bq/m3 at some moments in time. At the other points, the calculated concentration of 3H in water will not exceed 10 Bq/m3, which is less than the background concentration of 50 Bq/m3. Concerning 129I, its maximum concentration in water will be around 10-3 – 10-2 Bq/m3 at points close to the FDNPP and around 10-4 Bq/m3 at points further from the NPP, which is around 100 000 times less than the calculated concentrations of 3H.

Calculated concentrations of OBT (organically bound tritium) in predatory and prey fish are less than 0.01 Bq/kg at all points except the FDNPP point, where they are around 0.02 Bq/kg. This value is 10 times less than the measured concentration of OBT in fish (0.2 Bq/kg) obtained in 2014 in the coastal area near the damaged NPP. Calculated concentrations of 129I in predatory and prey fish are in the range 10-6 – 10-4 Bq/kg at all considered points. Concentrations of 129I in macro-algae are about 100 times higher due to the ability of iodine to accumulate in macro-algae.

How to cite: Bezhenar, R., Takata, H., and Maderich, V.: Transport of H-3 and I-129 in water and their uptake by marine organisms due to the planned release of Fukushima storage water, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6019, https://doi.org/10.5194/egusphere-egu23-6019, 2023.

EGU23-6026 | Orals | GI2.2

Dynamic change of dissolved Cs-137 from headwaters to downstream in the Kuchibuto River catchment 

Yuichi Onda, Taichi Kawano, Keisuke Taniguchi, and Junko Takahashi

The Fukushima Daiichi Nuclear Power Plant (FDNPP) accident on March 11, 2011 resulted in the release of large amounts of radioactive cesium-137 (137Cs) into the environment. It is important to characterize the Cs-137 dynamics throughout the river system, from the headwaters to the downstream reaches. Previous studies have suggested that dissolved Cs-137 derives mainly from organic matter in small watersheds and from suspended solids in large watersheds. Since the concentration of suspended-form Cs has been shown to decrease significantly after decontamination in evacuated areas (Feng et al. 2022), this rapid decrease in suspended-form Cs-137 concentration can be used to determine the origin of dissolved-form Cs. Therefore, we attempted to evaluate whether the dissolved Cs-137 was derived from organic matter or from suspended solids by comparing data before and after decontamination.

The objective of this study is to compare the decreasing trends of Cs-137 concentrations in decontaminated and undecontaminated areas, based on long-term monitoring of Cs-137 concentrations in suspended solids, the dissolved form, and coarse organic matter since 2011. The study area includes four headwater basins and four river basins (eight sites in total) in the Kuchibuto River watershed in the Yamakiya district of Fukushima Prefecture, located approximately 35 km northwest of the FDNPP.

In the Kuchibuto River watershed, a large inflow of decontaminated soil with low Cs-137 concentrations, due to the increase in bare land caused by decontamination, resulted in a rapid decrease in the concentration of suspended-form 137Cs in the decontaminated area in the headwaters and in the upper reaches of the river. However, no clear effect of decontamination was observed in the concentrations of dissolved Cs-137 or of Cs-137 in coarse organic matter. Comparison of the slopes of the Cs-137 concentration trends in the suspended, dissolved, and coarse organic matter fractions showed that the slope of the dissolved form was similar to that of the coarse organic matter in the headwater basins, and similar to that of the suspended solids in the downstream basins. These results suggest that the contribution of organic matter to dissolved Cs-137 is significant in small watersheds, and that of suspended solids in large watersheds.
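
The slope comparison described above amounts to fitting an effective decline rate to each log-transformed time series; a minimal Python sketch on synthetic data (rates hypothetical):

import numpy as np

def decline_rate(t_years, conc):
    # Fit ln(C) = ln(C0) - lam*t and return the effective decline rate lam (1/yr).
    lam, _ = np.polyfit(t_years, -np.log(conc), 1)
    return lam

t = np.arange(2011.0, 2022.0)
dissolved = 10.0 * np.exp(-0.21 * (t - 2011.0))   # synthetic dissolved series
coarse_om = 50.0 * np.exp(-0.20 * (t - 2011.0))   # synthetic coarse organic matter series
print(decline_rate(t, dissolved), decline_rate(t, coarse_om))
# Similar slopes, as here, would point to a common source, the pattern found
# for dissolved Cs-137 and coarse organic matter in the headwater basins.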

How to cite: Onda, Y., Kawano, T., Taniguchi, K., and Takahashi, J.: Dynamic change of dissolved Cs-137 from headwaters to downstream in the Kuchibuto River catchment, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6026, https://doi.org/10.5194/egusphere-egu23-6026, 2023.

EGU23-10093 | Posters on site | GI2.2

Riverine 137Cs dynamics and remobilization in coastal waters during high flow events 

Yoshifumi Wakiyama, Hyoe Takata, Keisuke Taniguchi, Takuya Niida, Yasunori Igarashi, and Alexei Konoplev

Understanding riverine 137Cs dynamics during high-flow events is crucial for improving the predictability of 137Cs transport and the relevant hydrological responses. It is frequently documented that the majority of 137Cs is exported during high-flow events triggered by intensive rainfall. Studies on 137Cs in coastal seawater suggested that huge high-flow events resulted in high dissolved 137Cs concentrations in seawater. Different temporal patterns of 137Cs concentrations in river water are found in the existing literature on 137Cs dynamics during high-flow events. Although such differences may reflect catchment characteristics, there has been no comprehensive analysis of these relationships. This study explores the catchment characteristics affecting 137Cs transport via rivers to the ocean, based on datasets obtained by sampling campaigns during high-flow events. 137Cs datasets obtained at 13 points in 6 river water systems were subjected to the analysis. The analyses explored the relationships between catchment characteristics (scale and land use composition) and 137Cs dynamics in terms of variations in concentration, fluxes, and potential remobilization in seawater. We could not find any significant correlations between the parameters of catchment characteristics and the mean values of normalized 137Cs concentrations and apparent Kd. However, when approximating 137Cs concentrations and Kd values as a power function of the suspended solid concentration (Y = αX^β), the exponent β in the equations for the dissolved 137Cs concentration and for Kd showed negative and positive correlations with the logarithm of the watershed area, respectively, and positive β values for Kd were found when the catchment area was on the order of 100 km2 or larger, and vice versa. This indicates that the dissolved 137Cs concentration tends to decrease with increased water discharge in larger catchments and to increase in smaller catchments. These results suggest that the temporal pattern of dissolved 137Cs concentrations depends on the watershed scale. The 137Cs flux during a single event ranged from 1.9 GBq to 1.1 TBq and accounted for 0.00074% to 0.22% of the total 137Cs deposited in the relevant catchments. Particulate 137Cs accounted for more than 92% of the total 137Cs flux, except in the Ukedo River basin with its large dam reservoir. The R-factor, an erosivity index in the Universal Soil Loss Equation model family, is a good parameter for reproducing sediment discharge and particulate 137Cs flux. The efficiency of particulate 137Cs flux, calculated by dividing the flux by the R-factor of the event, tended to be high in catchments with relatively low forest cover. The desorption ratio of 137Cs, obtained by a 1-day shaking experiment of suspended solids in seawater, ranged from 2.8 to 6.6%. The ratio was almost proportional to the ratio of exchangeable 137Cs. The estimated amounts of desorbed 137Cs, obtained by multiplying the particulate 137Cs flux by the desorption ratios, were greater than the direct flux of dissolved 137Cs. Reanalysis of riverine 137Cs datasets from high-flow events is revealing the relationships between catchment characteristics and 137Cs dynamics. Further analyses, such as evaluation of decontamination impacts and inter-catchment comparisons of 137Cs fluxes, are required for better understanding.
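
The power-function approximation Y = αX^β used in the analysis is conveniently fitted as a straight line in log-log space; a minimal Python sketch on synthetic data:

import numpy as np

def fit_power_law(x, y):
    # Fit Y = alpha * X**beta via linear regression of log(Y) on log(X).
    beta, log_alpha = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(log_alpha), beta

# Synthetic event samples: dissolved 137Cs falling as suspended solids (and
# hence discharge) rise, i.e. beta < 0, the pattern reported for larger catchments.
ss = np.array([5.0, 20.0, 80.0, 300.0])   # suspended solids, mg/L
dissolved = 0.5 * ss**-0.3                 # dissolved 137Cs, arbitrary units
print(fit_power_law(ss, dissolved))        # recovers (0.5, -0.3)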

How to cite: Wakiyama, Y., Takata, H., Taniguchi, K., Niida, T., Igarashi, Y., and Konoplev, A.: Riverine 137Cs dynamics and remobilization in coastal waters during high flow events, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10093, https://doi.org/10.5194/egusphere-egu23-10093, 2023.

EGU23-10539 | Posters on site | GI2.2 | Highlight

Long-term dynamics of 137Cs accumulation at an urban pond 

Honoka Kurosawa, Kenji Nanba, Toshihiro Wada, and Yoshifumi Wakiyama

Semi-enclosed water bodies such as ponds and dam reservoirs are known to be readily subject to 137Cs accumulation because of secondary inflows from their catchment areas. We present long-term monitoring data of the 137Cs concentration in bottom sediment and pond water of an urban pond located in the central area of Koriyama City, Fukushima Prefecture, to discuss the 137Cs dynamics of the urban pond. The pond was decontaminated by bottom sediment removal in 2017. Bottom sediment cores and pond water were collected in 2015 and in 2018-2021. Inflow and outflow water were collected in 2020-2021. River water around the pond was collected in 2021. The bottom sediment and water samples were measured for 137Cs concentration, particle size distribution, and N and C stable isotopes. Between 2015 and 2018, the 137Cs inventory and the 137Cs concentration in the 0-10 cm depth of the bottom sediment at 7 points decreased by 81% (mean 1.50 to 0.28 MBq/m2) and 85% (mean 31.5 to 4.8 kBq/kgDW), respectively. Although the mean 137Cs inventory in the bottom sediment did not change drastically during 2018-2021, its variability became wider. Points with an increased 137Cs inventory in the bottom sediment showed a year-by-year increase in the thickness of the layer with concentrations higher than 8 kBq/kgDW, a criterion considered for decontamination. The 137Cs concentration in suspended solids (SS) in the pond water was lower after decontamination, although it still remained above 8 kBq/kgDW. The 137Cs concentrations in SS of the inflow water were also high, exceeding 8 kBq/kgDW. The 137Cs concentration in SS of the river water around the pond was higher after it passed through the urban area, suggesting that the inflow of particles of urban origin maintained the high 137Cs level in the pond.

How to cite: Kurosawa, H., Nanba, K., Wada, T., and Wakiyama, Y.: Long-term dynamics of 137Cs accumulation at an urban pond, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10539, https://doi.org/10.5194/egusphere-egu23-10539, 2023.

EGU23-10868 | Posters on site | GI2.2

Estimation of annual Cesium-137 influx from the FDNPP to the coastal water 

Shun Satoh and Hyoe Takata

Due to the accident at the Fukushima Daiichi Nuclear Power Plant (1F) in March 2011, radionuclides were introduced into the environment, and one of the release pathways to the ocean is direct discharge from 1F (on-going release). This occurred mainly immediately after the accident, but the on-going release still continues. In this study, we first estimated the on-going release of 137Cs from 1F over the 10 years after the accident, using TEPCO's 137Cs monitoring results in the coastal area around 1F. Secondly, the monitoring data changed in response to countermeasures taken by TEPCO (e.g. construction of ice walls) to reduce the introduction of contaminated water into the ocean, so the effects of these countermeasures on the estimation of the on-going release were also discussed. A box model comprising the inside and outside of the port was assumed for the area around 1F, and the amount of 137Cs in the box was estimated (estimated value: modeled data). Then, the difference between the estimated value and the amount of 137Cs obtained from the actually observed concentrations (measured value: monitoring data) was calculated. The result showed that the measured value was higher than the estimated value, suggesting an on-going release from 1F. As for the decrease in the monitoring data after the countermeasures, it implies that the estimated rate of on-going release has been reduced by the countermeasures.
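
The box-model reasoning can be made concrete with a one-box mass balance for the port, in which the inventory evolves through water exchange, radioactive decay and a release term; the residual between observed and modelled inventories is then the signature of the on-going release. A minimal Python sketch with entirely hypothetical values:

import numpy as np

LAMBDA_CS137 = np.log(2) / (30.1 * 365.25)  # 137Cs decay constant, 1/day

def step_inventory(inv, c_outside, volume, q_exchange, release, dt=1.0):
    # One Euler step of dI/dt = Q*(C_out - I/V) - lambda*I + S, where I is the
    # 137Cs inventory (Bq) in the port box and S the on-going release (Bq/day).
    c_inside = inv / volume
    return inv + dt * (q_exchange * (c_outside - c_inside) - LAMBDA_CS137 * inv + release)

# With S = 0 the modelled inventory relaxes toward V*C_out; a persistently
# higher measured inventory implies S > 0, i.e. an on-going release.
inv = 1.0e9
for _ in range(365):
    inv = step_inventory(inv, c_outside=10.0, volume=1.0e6, q_exchange=2.0e5, release=0.0)
print(inv)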

How to cite: Satoh, S. and Takata, H.: Estimation of annual Cesium-137 influx from the FDNPP to the coastal water, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10868, https://doi.org/10.5194/egusphere-egu23-10868, 2023.

EGU23-11671 | Posters on site | GI2.2

Changes in Cs-137 concentrations in river-bottom sediments and their factors in Fukushima Prefecture rivers 

Naoyuki Wada, Yuichi Onda, Xiang Gao, and Chen Tang

The Fukushima Daiichi Nuclear Power Plant (FDNPP) accident in 2011 resulted in the release of large amounts of Cs-137 into the atmosphere. Cs-137 deposited on land was mainly distributed in forests, but some of it has been discharged to the sea through rivers. Previous work has focused on the dissolved and suspended forms of Cs-137 in rivers, and it is known that the discharge mechanism and concentration formation of Cs-137 differ depending on the land use in the river basin. On the other hand, few studies have focused on the dynamics of Cs-137 in river-bottom sediments. River-bottom sediment is less likely to flow downstream than suspended sediment, so contamination in the downstream area may persist over the long term.
We aim to clarify the migration mechanism of Cs-137 in rivers, including river-bottom sediment. To this end, we analyzed data collected from 2011 to 2018 in 89 watersheds in Fukushima Prefecture. In analyzing the data, we removed sampling points with brackish water using electrical conductivity and corrected for particle size to standardize the surface area of the particles that adsorb Cs-137. As a result, it was found that, unlike the dissolved and suspended forms, the Cs concentration in river-bottom sediments can increase within the initial year. This is related to the average initial deposition in the watershed and the amount of initial deposition at the river-bottom sediment sampling sites, with a tendency to increase with relatively higher initial deposition in the upstream area. It was also found that the decrease in suspended Cs concentration is more pronounced when anthropogenic activities in the watershed are more active, but there was no clear relationship between land use in the watershed and changes in river-bottom sediment Cs concentration. This indicates that suspended sediment Cs concentrations are controlled by initial deposition on the suspended sediment source areas, whereas river-bottom Cs concentrations are controlled by multiple factors such as sediment traction and Cs supply from river water.
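
One simple way to implement such a particle-size correction (the exact scheme used in this study is not detailed here) is to normalize concentrations by an estimated specific surface area, assuming spherical grains and surface-controlled sorption; a hypothetical Python sketch:

def specific_surface_area(d50_um, density_g_cm3=2.65):
    # Geometric specific surface area of spheres: SSA = 6 / (rho * d), in m2/g.
    return 6.0 / (density_g_cm3 * 1.0e6 * d50_um * 1.0e-6)

def size_corrected_cs(conc_bq_kg, d50_um, d50_ref_um=20.0):
    # Scale a sediment Cs-137 concentration to a reference grain size by the
    # SSA ratio (hypothetical normalization, not the study's actual scheme).
    return conc_bq_kg * specific_surface_area(d50_ref_um) / specific_surface_area(d50_um)

print(size_corrected_cs(1000.0, d50_um=10.0))  # fine sample scaled down to 500 Bq/kg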

How to cite: Wada, N., Onda, Y., Gao, X., and Tang, C.: Changes in Cs-137 concentrations in river-bottom sediments and their factors in Fukushima Prefecture rivers, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11671, https://doi.org/10.5194/egusphere-egu23-11671, 2023.

EGU23-12670 | ECS | Orals | GI2.2

Minimizing the loss of radioactively contaminated sediment from the Niida watershed (Fukushima, Japan) through spatially targeted afforestation. 

Floris Abrams, Lieve Sweeck, Johan Camps, Grethell Castillo-Reyes, Bin Feng, Yuichi Onda, and Jos Van Orshoven

Government-led decontamination of agricultural land in the region affected by the Fukushima accident (2011) has lowered the on-site radiation risk considerably. From 2013 to early 2017, 11.9% of the land in the affected Niida watershed in Japan was remediated through topsoil removal. However, this resulted in a 237.1% increase in suspended sediment loads in the river in 2016 compared with 2013. In contrast, sediment loads decreased by 41% from 2016 to 2017; this can be attributed to the effect of natural vegetation restoration on sediment yield and transfer patterns (Bin et al., 2022). Since radiocaesium binds firmly to the clay minerals in the soil, it is inevitably transported along with the sediments downstream to the river systems. These observations confirm that rapid, spatially targeted interventions, such as revegetation, e.g. through afforestation, have the potential to decrease the magnitude and duration of increased exports of contaminated sediments. The CAMF tool (Cellular Automata-based heuristic for Minimizing Flow) (Vanegas et al., 2012) was originally designed to find the cells in a raster representation of a watershed for which afforestation would lead to a maximal reduction of sediment exports with minimal effort or cost, while taking sediment flow from cell to cell into account. In our research, we adapted the CAMF tool to account for the radiocaesium budgets associated with the transported sediments. We applied the approach to the Niida catchment, where land-cover changes in upstream decontaminated regions are detected using drone imagery and linked to increased sediment loads in the Niida River using long-term river monitoring systems. For example, in 2014, agricultural land (18.02 km2) was one of the major land uses in the regions where decontamination was ordered, resulting in increased sediment loads from 2014 to 2016. By recognizing both the on- and off-site impacts of the remediation interventions and their temporal dynamics, the modified CAMF tool offers scope for supporting the formulation of spatio-temporal schemes for the remediation of agricultural land. These schemes aim to decrease the radiation risk for downstream communities and minimize the potential recontamination of already decontaminated sites.
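
Conceptually, a CAMF-style optimization ranks candidate cells by the avoided export per unit cost, re-evaluating cell-to-cell flows as cells are selected. The toy Python sketch below illustrates only the greedy selection idea, with the flow routing collapsed into a precomputed per-cell delivery to the outlet; it is not the actual CAMF implementation:

import numpy as np

def greedy_afforestation(delivered_kg, cs_bq_per_kg, cost, budget):
    # delivered_kg: sediment from each cell reaching the outlet (kg/yr);
    # cs_bq_per_kg: 137Cs activity of that sediment; cost: effort per cell.
    benefit = delivered_kg * cs_bq_per_kg   # Bq/yr avoided if the cell is afforested
    order = np.argsort(-benefit / cost)     # best benefit-cost ratio first
    chosen, spent = [], 0.0
    for i in order:
        if spent + cost[i] <= budget:
            chosen.append(int(i))
            spent += cost[i]
    return chosen

# Hypothetical 5-cell example with a budget of two cells:
print(greedy_afforestation(np.array([120.0, 40.0, 300.0, 10.0, 80.0]),
                           np.array([5e3, 2e4, 1e3, 4e4, 8e3]),
                           np.ones(5), budget=2.0))   # -> [1, 4]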

How to cite: Abrams, F., Sweeck, L., Camps, J., Castillo-Reyes, G., Feng, B., Onda, Y., and Van Orshoven, J.: Minimizing the loss of radioactively contaminated sediment from the Niida watershed (Fukushima, Japan) through spatially targeted afforestation., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12670, https://doi.org/10.5194/egusphere-egu23-12670, 2023.

EGU23-13366 | Orals | GI2.2

Similarity of long-term temporal decrease in atmospheric Cs-137 between Chernobyl and Fukushima 

Kentaro Akasaki, Shu Mori, Eiichi Suetomi, and Yuko Hatano

We compare the atmospheric concentrations of Cs-137 more than a decade after the Chernobyl and Fukushima accidents. We plotted 8 datasets on log-log axes (5 cases in Chernobyl and 3 cases in Fukushima) and found that they appear to follow a single function.

The atmospheric concentration has been measured for more than 30 years after the Chernobyl accident [1]. On the other hand, several teams of Japanese researchers have been carrying out measurements in Fukushima and its vicinity for almost 10 years [2][3]. In this study, we compare 5 sites in Chernobyl (Pripyat, Chernobyl, Baryshevka, Kiev, and Polesskoe) and 3 sites in Fukushima (FDNPP O-6 and O-7, Univ. Fukushima).

We adjust the magnitude of the data because it depends on the amount of the initial deposition. After the adjustment, we plot the 8 cases on a log-log plot. We found that the 8 cases collapse onto a single curve, with a power-law index of -1.6. Namely,

C(t) ~ t^{-1.6}    …(1)

Incidentally, we have previously proposed a formula that reproduces the long-term behavior of the atmospheric concentration at a fixed location:

C(t) = A exp(-bt) t^{-4/3}    …(2)

where A is a parameter related to the amount of the initial deposition and b is the total rate of all first-order reactions (including the radioactive decay rate, the vegetation uptake rate, the runoff rate, etc.). We will investigate the difference between the power-law indices in Eqs. (1) and (2). The parameter b is highly dependent on the environment. When we take a proper value of b, the apparent decrease of the concentration deviates from t^{-4/3}, and the apparent power-law index can become close to -1.6.
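
This reconciliation can be checked numerically: for Eq. (2) the local log-log slope is d ln C / d ln t = -4/3 - b·t, so a single power law fitted over a finite window steepens beyond -4/3 as b grows. A minimal Python sketch (b values illustrative):

import numpy as np

def apparent_index(b, t_start=1.0, t_end=30.0, n=200):
    # Fit one power law to C(t) = A*exp(-b*t)*t**(-4/3) over [t_start, t_end]
    # years and return the apparent (fitted) power-law index.
    t = np.linspace(t_start, t_end, n)
    log_c = -b * t - (4.0 / 3.0) * np.log(t)
    slope, _ = np.polyfit(np.log(t), log_c, 1)
    return slope

for b in (0.0, 0.01, 0.025):
    print(b, apparent_index(b))
# b = 0 recovers -4/3; b around 0.025 1/yr yields an apparent index near -1.6
# over a 30-year window.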


[1] E. K. Garger, et al., J. Env. Radioact., 110 (2012) 53-58.

[2] A. Watanabe, et al., Atmos. Chem. Phys. 22 (2022) 675-692.

[3] T. Abe, K. Yoshimura, Y. Sanada, Aerosol and Air Quality Research, 21 (2021) 200636.

How to cite: Akasaki, K., Mori, S., Suetomi, E., and Hatano, Y.: Similarity of long-term temporal decrease in atmospheric Cs-137 between Chernobyl and Fukushima, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13366, https://doi.org/10.5194/egusphere-egu23-13366, 2023.

EGU23-13486 | ECS | Posters virtual | GI2.2

Distributions of tritium in the marine water and biota around Rokkasho Reprocessing Plant 

Satoru Ohtsuki, Yuhei Shirotani, and Hyoe Takata

For the decommissioning of the Fukushima Daiichi Nuclear Power Station (FDNPS), one of the biggest problems is treating the radioactively contaminated stagnant water in the buildings. It is difficult to remove H-3 from the contaminated water by the Advanced Liquid Processing System (ALPS) treatment alone. Thus, the Japanese Government announced the release of the ALPS-treated water containing H-3. To predict the alteration of the dose rate to marine biota caused by the change in the H-3 concentration in marine water after the release of the ALPS water, it is necessary to understand the dynamics of H-3 in the marine ecosystem. In this study, we examined the behavior of H-3 in the marine environment (water and biota) off Aomori and Iwate prefectures from FY2003 to FY2012 as background data for the Pacific Ocean along the coast of northeastern Japan. To clarify the dynamics of H-3 in marine biota, we compared H-3 and Cs-137. Excluding the period of the intermittent test operation of the Rokkasho Reprocessing Plant (FY2006-FY2008), the concentrations of H-3 in seawater, tissue free water tritium (TFWT) and organically bound tritium (OBT) were 0.052-0.20 Bq/L with a mean of 0.12±0.031 Bq/L, 0.050-0.34 Bq/kg-wet with a mean of 0.11±0.039 Bq/kg-wet, and 0.0070-0.099 Bq/kg-wet with a mean of 0.042±0.019 Bq/kg-wet, respectively. Before the FDNPS accident (FY2003-FY2010), the Cs-137 concentrations in seawater and marine biota were 0.00054-0.0027 Bq/L with a mean of 0.0016±0.00041 Bq/L and 0.022-1.8 Bq/kg-wet with a mean of 0.090±0.037 Bq/kg-wet, respectively. The Concentration Ratio (CR), the ratio of the concentration in marine biota to that in seawater, was 0.34-2.37 for TFWT with a mean of 0.97±0.31 across all species, meaning the concentration in marine biota was almost equal to that in seawater. For Cs-137, CRs were 46-78 with a mean of 56±22. We compared the CRs for TFWT of Gadus macrocephalus, Lophius litulon and Oncorhynchus keta with those for Cs-137. Comparing CR-TFWT and CR-Cs-137 for these three species, the Spearman R was <0.4 and p was >0.05, indicating that the dynamics of TFWT and Cs-137 in the marine ecosystem are decoupled.
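
The two statistics used in this comparison are straightforward to compute; a minimal Python sketch with hypothetical paired data (scipy's spearmanr):

import numpy as np
from scipy.stats import spearmanr

def concentration_ratio(c_biota, c_seawater):
    # CR = concentration in biota / concentration in ambient seawater.
    return c_biota / c_seawater

# Hypothetical paired values for one species across four sampling dates:
cr_tfwt = concentration_ratio(np.array([0.10, 0.15, 0.09, 0.20]),
                              np.array([0.12, 0.13, 0.11, 0.18]))
cr_cs137 = np.array([61.0, 47.0, 52.0, 70.0])
rho, p = spearmanr(cr_tfwt, cr_cs137)
print(rho, p)  # here |rho| < 0.4 and p > 0.05: the decoupled pattern reported above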

How to cite: Ohtsuki, S., Shirotani, Y., and Takata, H.: Distributions of tritium in the marine water and biota around Rokkasho Reprocessing Plant, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13486, https://doi.org/10.5194/egusphere-egu23-13486, 2023.

EGU23-15515 | Posters on site | GI2.2

137Cs transport flux to surface water due to shallow groundwater discharge from forest hillslope 

Yuma Niwano, Hiroaki Kato, Satoru Akaiwa, Donovan Anderson, Hikaru Iida, Miyu Nakanishi, Yuichi Onda, Hikaru Sato, and Tadafumi Niizato

Groundwater systems and surface water can interact in a complex manner that influences catchment discharge, and these interactions become even more complex on forest slopes. A large amount of the radioactive cesium (137Cs) deposited on forests by the Fukushima Daiichi Nuclear Power Plant accident remains in terrestrial environments and is transported downstream in suspended or dissolved form by surface water. Generally, the concentration of dissolved 137Cs in surface water increases especially during runoff. While the leaching behavior of 137Cs from contaminated forest materials and soils to surface water has been heavily studied, the influence of the 137Cs concentration in shallow groundwater systems on forest slopes has not been investigated. Detailed hydrological observations of groundwater on a forest hillslope therefore enable a quantitative analysis of the influence of groundwater flow on the formation of dissolved 137Cs concentrations in surface water during base flow and during runoff. Our results showed that the dissolved 137Cs concentration in surface water increases during discharge events. The average concentration of dissolved 137Cs in shallow groundwater was 0.64 Bq/L, which was higher than that in surface water (average 0.10 Bq/L). Furthermore, part of the shallow groundwater on the slope was observed to move toward the river channel during runoff. This suggests that shallow groundwater may have flowed into the surface water during runoff and contributed to the increase of 137Cs in the surface water. In this study, the contribution of groundwater on forest slopes to the dissolved 137Cs concentration in surface water was estimated using the hydraulic gradient distribution of groundwater on the forest slope and the measured dissolved 137Cs concentrations in groundwater.
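
The contribution estimate can be framed as a two-endmember mixing problem; a minimal Python sketch using the mean concentrations quoted above, with a hypothetical event-water value:

def groundwater_fraction(c_stream, c_gw, c_base):
    # Two-endmember mixing: C_stream = f*C_gw + (1 - f)*C_base, solved for f.
    return (c_stream - c_base) / (c_gw - c_base)

# Dissolved 137Cs (Bq/L): shallow groundwater 0.64 (measured mean), baseline
# surface water 0.10 (measured mean), event stream water 0.15 (hypothetical):
print(groundwater_fraction(0.15, 0.64, 0.10))  # ~0.09, i.e. ~9% groundwater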

How to cite: Niwano, Y., Kato, H., Akaiwa, S., Anderson, D., Iida, H., Nakanishi, M., Onda, Y., Sato, H., and Niizato, T.: 137Cs transport flux to surface water due to shallow groundwater discharge from forest hillslope, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15515, https://doi.org/10.5194/egusphere-egu23-15515, 2023.

ESSI2 – Data, Software and Computing Infrastructures across Earth and space sciences

EGU23-2454 | ECS | Orals | ESSI2.2

The future of NASA Earth Science in the commercial cloud: Challenges and opportunities 

Alexey Shiklomanov, Manil Maskey, Yoseline Angel, Aimee Barciauskas, Philip Brodrick, Brian Freitag, and Jonas Sølvsteen

NASA produces a large volume and variety of data products that are used every day to support research, decision making, and education. The widespread use of NASA’s Earth Science data is enabled by NASA’s Earth Science Data System (ESDS) program, which oversees the archiving and distribution of these data and invests in the development of new data systems and tools. However, NASA’s current approach to Earth Science data distribution — based on distributed institutional archives with individual on-premises high-performance computing capabilities — faces some significant challenges, including massive increases in data volume from upcoming missions, a greater need for transdisciplinary science that synthesizes many different kinds of observations, and a push to make science more open, inclusive, and accessible. To address these challenges, NASA is aggressively migrating its Earth Science data and related tools and services into the commercial cloud. Migration of data into the commercial cloud can significantly improve NASA’s existing data system capabilities (1) by providing more flexible options for storage and compute (including rapid, as-needed access to state-of-the-art capabilities); (2) by centralizing and standardizing data access, which gives all of NASA’s institutional data centers access to all of each other’s datasets; and (3) by facilitating “analysis-in-place”, whereby users can bring their own computational workflows and tools to the data rather than having to maintain their own copies of NASA datasets. However, migration to the commercial cloud also poses some significant challenges, including (1) managing costs under a “pay-as-you-go” model; (2) incompatibility of existing tools and data formats with object-based storage and network access; (3) vendor lock-in; (4) challenges with data access for workflows that mix on-premises and cloud computing; and (5) standardization of the highly diverse data present in NASA’s data archive. I conclude with two examples of recent NASA activities showcasing capabilities enabled by the commercial cloud: an interactive analysis and development platform for analyzing airborne imaging spectroscopy data, and a new collection of tools and services for data discovery, analysis, publication, and data-driven storytelling (Visualization, Exploration, and Data Analysis, VEDA).
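
As an illustration of the analysis-in-place pattern, the snippet below opens a dataset directly from cloud object storage with xarray and s3fs and reduces it without keeping a local copy; the bucket, file and variable names are hypothetical placeholders, not actual NASA holdings:

import s3fs
import xarray as xr

# Anonymous access to a public bucket (names are placeholders).
fs = s3fs.S3FileSystem(anon=True)
with fs.open("s3://example-earthdata-bucket/granules/sample_granule.nc") as f:
    ds = xr.open_dataset(f, engine="h5netcdf")               # open over the network
    monthly = ds["surface_temp"].resample(time="1M").mean()  # reduce in place
    print(monthly)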

How to cite: Shiklomanov, A., Maskey, M., Angel, Y., Barciauskas, A., Brodrick, P., Freitag, B., and Sølvsteen, J.: The future of NASA Earth Science in the commercial cloud: Challenges and opportunities, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2454, https://doi.org/10.5194/egusphere-egu23-2454, 2023.

EGU23-3657 | Orals | ESSI2.2

CADS 2.0: A FAIRest Data Store infrastructure blooming in a landscape of Data Spaces. 

Angel lopez alos, Baudouin raoult, Edward comyn-platt, and James varndell

First launched as the Climate Data Store (CDS) supporting the Climate Change Service (C3S) and later instantiated as the Atmosphere Data Store (ADS) for the Atmosphere Monitoring Service (CAMS), the shared underlying Climate & Atmosphere Data Store Infrastructure (CADS) represents the technical backbone for the implementation of the Copernicus services entrusted to ECMWF on behalf of the European Commission. In addition, the CDS also offers access to a selection of datasets from the Emergency Management Service (CEMS). As the flagship instance of the infrastructure, the CDS counts more than 160k registered users and delivers a daily average of over 100 TB of data from a catalogue of 141 datasets.

The CADS software infrastructure is designed as a distributed system and open framework that facilitates improved access to a broad spectrum of data and information via a powerful service-oriented architecture offering seamless web-based and API-based search and retrieve capabilities. CADS also provides a generic software toolbox that allows users to make use of the available datasets, and a series of state-of-the-art data tools that can be combined into more elaborate processes and present their results graphically in the form of interactive web applications. The CADS infrastructure is hosted in an on-premises cloud physically located within the ECMWF Data Centre and implemented using a collection of virtual machines, networks and large data volumes. Fully customized instances of CADS, including the dedicated virtual hardware infrastructure, software application and catalogued content, can be easily deployed thanks to automation and configuration software tools and a set of configuration files managed by a distributed version control system. Tailored scripts and templates make it easy to accommodate different standards and to interoperate with external platforms.
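
API-based retrieval from the CDS uses the public cdsapi client; a typical request looks like the sketch below (the dataset identifier and request keys follow the CDS catalogue documentation and should be checked there for the dataset of interest):

import cdsapi

client = cdsapi.Client()  # reads the API URL and key from ~/.cdsapirc
client.retrieve(
    "reanalysis-era5-single-levels",   # a CDS dataset identifier
    {
        "product_type": "reanalysis",
        "variable": "2m_temperature",
        "year": "2022",
        "month": "01",
        "day": "01",
        "time": "12:00",
        "format": "netcdf",
    },
    "era5_t2m_20220101.nc",            # local target file
)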

ECMWF, in partnership with EUMETSAT, ESA and the EEA, also implements the Data and Information Access Services (DIAS) platform called WEkEO, a distributed cloud-computing infrastructure used to process the data generated by the Copernicus Services and make them accessible to users, together with derived products and all satellite data from the Copernicus Sentinels. Within the partnership, ECMWF is responsible for the procurement of the software implementing the Data Access Services, Processing and Tools, whose specifications build on the same fundamentals as CADS. The adoption of FAIR principles has proven to be a cornerstone for maximizing synergies and interactions between CADS, WEkEO and other related platforms.


Driven by increasing demand and the evolving landscape of platforms and services, a major project for the modernization of the CADS infrastructure is currently underway. The coming CADS 2.0 aims to capitalize on the experience, feedback, lessons learned and know-how from the current CADS, embrace advanced technologies, engage with a broader user community, make the current platform more versatile and cloud-oriented, improve workflows and methodologies, ensure compatibility with state-of-the-art solutions such as machine learning, data cubes and interactive notebooks, consolidate the adoption of FAIR principles and strengthen synergies with related platforms.


As complementary infrastructures, WEkEO will allow users to harness compute resources without the networking and storage costs associated with public cloud offerings; the CADS Toolbox 2.0 will be deployed and run there, allowing heavy jobs (retrieval and reduction) to be submitted to the CADS 2.0 core infrastructure as services.

How to cite: lopez alos, A., raoult, B., comyn-platt, E., and varndell, J.: CADS 2.0: A FAIRest Data Store infrastructure blooming in a landscape of Data Spaces., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-3657, https://doi.org/10.5194/egusphere-egu23-3657, 2023.

EGU23-5038 | Posters on site | ESSI2.2

EO4EU - AI-augmented ecosystem for Earth Observation data accessibility with Extended reality User Interfaces for Service and data exploitation 

Vasileios Baousis, Stathes Hadjiefthymiades, Charalampos Andreou, Kakia Panagidh, and Armagan Karatosun

EO4EU is a European Commission-funded innovation project bringing forward the EO4EU Platform, which will make access to and use of EO data easier for environmental, governmental, and even business forecasts and operations.

The EO4EU Platform, which will be accessible at www.eo4eu.eu, will link already existing major EO data sources such as GEOSS, INSPIRE, Copernicus, Galileo and DestinE, among others, and will provide a number of tools and services to assist users in finding and accessing the data they are interested in, as well as in analysing and visualising these data. The platform will leverage machine learning to support the handling of the characteristically large volumes of EO data, as well as a combination of cloud computing infrastructure and pre-exascale high-performance computing to manage processing workloads.

Specific attention is also given to developing user-friendly interfaces for EO4EU allowing users to intuitively use EO data freely and easily, even with the use of extended reality.

EO4EU objectives are:

  • A holistic DataOps ecosystem to enhance the access and usability of EO information.
  • A semantic-enhanced knowledge graph that augments the FAIRness of EO data and supports sophisticated data representation and dynamics.
  • A machine learning pipeline that enables the dynamic annotation of the various EO data sources.
  • Efficient, reliable and interoperable inter- and intra-data-layer communications.
  • Advancing stakeholders’ knowledge capacity through informed decision-making and policy-making support.
  • A full range of use case scenarios addressing current data needs, capitalizing on existing digital services and platforms, fostering their usability and practicality, and taking into account ethical aspects aiming at social impact maximization.

Technical and scientific innovation can be summarised as follows:

  • Improve compression rates for a given image quality and reduce data volumes.
  • Improve the quality of reconstructed compressed images while maintaining the same compression rates.
  • Facilitate the design of custom services with a minimized labelled-data requirement.
  • Learn robust and transferable representations of EO data.
  • Publish the original models trained on EO data, with all relevant assisting material to support reusability, in a public repository.
  • Optimize the execution of data fusion in HPC and GPU environments.
  • Better accuracy of data representation.
  • Customizable visualization tools tailored to the needs of each use case.
  • Dedicated graphs for end-users with various granularities, modalities, metrics and statistics to observe overall trends in time, correlations, and cause-and-effect relationships through a responsive web-interfaced module.

In this presentation, the status of the project, the adopted architecture and the findings from our initial user surveys pertaining to EO data access and discovery will be analysed. Finally, the next steps of the project, the early access to the developed platform and the challenges and opportunities will be discussed.  

How to cite: Baousis, V., Hadjiefthymiades, S., Andreou, C., Panagidh, K., and Karatosun, A.: EO4EU - AI-augmented ecosystem for Earth Observation data accessibility with Extended reality User Interfaces for Service and data exploitation, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5038, https://doi.org/10.5194/egusphere-egu23-5038, 2023.

EGU23-5862 | Orals | ESSI2.2

The ESA Green Transition Information Factories – using Earth Observation and cloud-based analytics to address the Green Transition information needs. 

Patrick Griffiths, Stefanie Lumnitz, Christian Retscher, Frank-Martin Seifert, and Yves-Louis Desnos

In response to the global climate and sustainability crisis, many countries have expressed ambitious goals in terms of carbon neutrality and a green economy. In this context, the European Green Deal comprises several policy elements aimed at achieving carbon neutrality by 2050.

In response to these ambitions, the European Space Agency (ESA) is initiating various efforts to leverage space technologies and data in support of the Green Deal ambitions. The ESA Space for Green Future (S4GF) Accelerator will explore new mechanisms to promote the use of space technologies and advanced modelling approaches for scenario investigations on the Green Transition of the economy and society.

A central element of the S4GF accelerator are the Green Transition Information Factories (GTIF). GTIF takes advantage of Earth Observation (EO) capabilities, geospatial and digital platform technologies, as well as cutting edge analytics to generate actionable knowledge and decision support in the context of the Green Transition.

A first national-scale GTIF demonstrator has now been developed for Austria. It addresses the information needs and national priorities for the Green Deal in Austria, which were established through a bottom-up consultation and co-creation process with various national stakeholders and expert entities. These requirements are matched with various EO industry teams that develop the corresponding GTIF capabilities.

The current GTIF demonstrator for Austria (GTIF-AT) builds on top of federated European cloud services, providing efficient access to key EO data repositories and rich interdisciplinary datasets. GTIF-AT initially addresses five Green Transition domains: (1) Energy Transition, (2) Mobility Transition, (3) Sustainable Cities, (4) Carbon Accounting and (5) EO Adaptation Services.

For each of these domains, scientific narratives are provided and elaborated using scrollytelling technologies. The GTIF interactive explore tools allow users to explore the domains and subdomains in more detail and to better understand the challenges, complexities, and underlying socio-economic and environmental conflicts. They combine domain-specific scientific results with intuitive graphical user interfaces and modern frontend technologies. In the Energy Transition domain, users can interactively investigate the suitability of locations at 10 m resolution for the expansion of renewable (wind or solar) energy production. The tools also allow investigating the underlying conflicts, e.g. with existing land uses or biodiversity constraints. Satellite-based altimetry is used to dynamically monitor the water levels in hydropower reservoirs to infer the related energy storage potentials. In the Sustainable Cities domain, users can investigate photovoltaic installations on rooftops and assess their suitability in terms of roof geometry and expected energy yields.
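
The kind of suitability screening exposed by these explore tools can be illustrated as a raster overlay that combines resource, terrain and land-use constraints into a Boolean mask per grid cell; a toy numpy sketch in which all layers and thresholds are hypothetical:

import numpy as np

def wind_suitability(wind_ms, slope_deg, protected, built_up,
                     min_wind=6.0, max_slope=15.0):
    # Suitable where mean wind exceeds a threshold, terrain is not too steep,
    # and neither protection nor existing built-up land use applies.
    return (wind_ms >= min_wind) & (slope_deg <= max_slope) & ~protected & ~built_up

# Hypothetical 3x3 grid of 10 m cells:
wind = np.array([[7.1, 5.2, 6.8], [6.5, 6.9, 4.9], [7.4, 6.1, 6.3]])
slope = np.array([[5.0, 3.0, 20.0], [8.0, 2.0, 1.0], [12.0, 30.0, 4.0]])
protected = np.zeros((3, 3), dtype=bool); protected[0, 2] = True
built_up = np.zeros((3, 3), dtype=bool); built_up[2, 0] = True
print(wind_suitability(wind, slope, protected, built_up))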

GTIF enables various users to inform themselves about, and interactively investigate, the challenges as well as the opportunities related to the Green Transition ambitions. This enables, for example, citizens to engage in the discussion process around renewable energy expansion, and supports energy start-ups in developing new services. The GTIF development follows an open-science and open-source approach, and several new GTIF instances are planned for the coming years, addressing the Green Deal information needs and accelerating the Green Transition. This presentation will showcase some of the GTIF interactive explore tools and provide an outlook on future efforts.

How to cite: Griffiths, P., Lumnitz, S., Retscher, C., Seifert, F.-M., and Desnos, Y.-L.: The ESA Green Transition Information Factories – using Earth Observation and cloud-based analytics to address the Green Transition information needs., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5862, https://doi.org/10.5194/egusphere-egu23-5862, 2023.

EGU23-5936 | Orals | ESSI2.2

What does the European Spatial Data Infrastructure INSPIRE need in order to become a Green Deal Data Space? 

Joan Masó, Alba Brobia, Ivette Serral, Ingo Simonis, Francesca Noardo, Lucy Bastin, Carlos Cob Parro, Joaquín García, Raul Palma, and Sébastien Ziegler

In May 2007, the INSPIRE directive established the path towards creating the European Spatial Data Infrastructure (ESDI). While the Joint Research Centre (JRC) defined a set of detailed implementation guidelines, the European member states determined the agencies responsible for delivering the different topics specified in the directive’s annexes. INSPIRE’s goal was - and still is - to organize and share Europe’s data supporting environmental policies and actions. However, the way that INSPIRE was defined limited contributions to the public sector and limited topics to those specifically listed in its annexes. Technical challenges and a lack of appropriate tools have prevented INSPIRE from fully implementing its own guidelines, and even after 15 years, the dream of a continuous, consistent description of Europe’s environment has still not completely materialized. We should apply the lessons learnt from INSPIRE when we build the Green Deal Data Space (GDDS). To create the GDDS, we should start from the ESDI, but also engage and align with the ongoing preparatory actions for data spaces (e.g. for the Green Deal and agriculture) as well as include actors and networks that have emerged or become organized in recent years. These include: networks of in situ observations (e.g. the Environmental Research Infrastructures (ENVRI) community); Citizen Science initiatives (such as the biodiversity observations integrated in the Global Biodiversity Information Facility (GBIF), or sensor communities for e.g. air quality); predictive algorithms, machine learning models and simulations based on artificial intelligence (such as the ones deployed in the European Open Science Cloud, the International Data Spaces Association and Gaia-X - services driven both by the scientific community and the private sector); and remote sensing derived products developed by the Copernicus Services. Most of these data providers have already embraced the FAIR principles and open data, providing many examples of best practice which can assist newer adopters on the path to open science. In the Horizon Europe project AD4GD (AllData4GreenDeal), we believe that, instead of trying to force data producers to adopt cumbersome new protocols, we should take advantage of the latest developments in geospatial standards and APIs. These allow loosely coupled but well documented and interlinked data sources and models in the GDDS, while achieving scientifically robust integration and easy access to data in the resulting workflows. Another fundamental element will be the adoption of a common and extensible information model enabling the representation and exchange of Green Deal related data in an unambiguous manner, including vocabularies for Essential Variables to organize the observable measurements and increase the level of semantic interoperability. This will allow systems and components from different technology providers to seamlessly interoperate and exchange data, and to provide an integrated view of, and access to, the full value of the available data. The project will validate the approach in three pilot cases: water quality and availability of Berlin lakes, biodiversity corridors in the metropolitan area of Barcelona, and low-cost air quality sensors across Europe. The AD4GD project is funded by the European Union under the Horizon Europe programme.
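
The loosely coupled integration argued for here builds on standardized web APIs such as OGC API - Features, where collections are discovered and queried through plain HTTP and JSON; a minimal Python sketch against a hypothetical endpoint (the server URL, collection id and parameters are placeholders):

import requests

BASE = "https://example.org/ogcapi"  # hypothetical OGC API - Features endpoint

# List the collections offered by the server (standard OGC API - Features path):
collections = requests.get(f"{BASE}/collections", timeout=30).json()
print([c["id"] for c in collections["collections"]])

# Query water-quality observations for a bounding box around Berlin:
items = requests.get(
    f"{BASE}/collections/lake-water-quality/items",
    params={"bbox": "13.0,52.3,13.8,52.7", "limit": 10},
    timeout=30,
).json()
print(len(items["features"]))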

How to cite: Masó, J., Brobia, A., Serral, I., Simonis, I., Noardo, F., Bastin, L., Cob Parro, C., García, J., Palma, R., and Ziegler, S.: What does the European Spatial Data Infrastructure INSPIRE need in order to become a Green Deal Data Space?, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5936, https://doi.org/10.5194/egusphere-egu23-5936, 2023.

EGU23-7052 | Orals | ESSI2.2

FAIRiCUBE: Enabling Gridded Data Analysis for All 

Katharina Schleidt and Stefan Jetschny

Previously, collecting, storing, owning and, if necessary, digitizing data was vital for any data-driven application. Nowadays, we are swimming in data, whereby one could postulate that we are drowning. However, downloading vast data to local storage and subsequent in-house processing on dedicated hardware is inefficient and not in line with the big data processing philosophy. While the FAIR principles are fulfilled as the data is findable, accessible, and interoperable, the actual reuse of the data to gain new insights depends on the data user’s local capabilities. Scientists aware of the potentially available data and processing capabilities are still not able to easily leverage these resources as required to perform their work; while the analysis gap entailed by the information explosion is being increasingly highlighted, remediation lags.

The core objective of the FAIRiCUBE project is to enable players from beyond classic Earth Observation (EO) domains to provide, access, process, and share gridded data and algorithms in a FAIR and TRUSTable manner. To reach this objective, we are creating the FAIRiCUBE HUB, a crosscutting platform and framework for data ingestion, provision, analysis, processing, and dissemination, to unleash the potential of environmental, biodiversity and climate data through dedicated European data spaces.
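
As a hint of what such gridded data access can look like for users outside classic EO domains, the sketch below opens a cloud-hosted data cube with xarray; the Zarr URL and variable name are assumptions for illustration, not actual FAIRiCUBE HUB endpoints:

    import xarray as xr

    # Hypothetical cloud-hosted cube; a real catalogue entry would supply the URL
    ds = xr.open_zarr("https://example.org/cubes/land-surface-temperature.zarr")

    # A typical cube-style query: subset in space and time, then reduce
    subset = ds["lst"].sel(
        lon=slice(16.2, 16.6),
        lat=slice(48.3, 48.1),   # latitude is stored descending in many EO grids
        time=slice("2020-06-01", "2020-08-31"),
    )
    print(subset.mean(dim="time"))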

To better understand the various obstacles to leveraging available assets, with regard to both data and the modalities for analysis and processing, several use cases have been defined addressing diverse aspects of European Green Deal (EGD) priority actions. Each use case has a defined objective, approach, research question and data requirements.

The use cases selected to guide the creation of the FAIRiCUBE HUB are as follows:

  • Urban adaptation to climate change
  • Biodiversity and agriculture nexus
  • Biodiversity occurrence cubes
  • Drosophila landscape genomics
  • Spatial and temporal assessment of neighborhood building stock

Many of the issues encountered within the FAIRiCUBE project are formally considered solved: catalogues detail the available datasets, and standards define how datasets are to be structured and annotated with the relevant metainformation. A vast array of processing functionality has emerged that can be applied to such resources. However, while all of this is considered state of the art in the EO community, a subtle delta still blocks wider communities from making good use of the available resources in their own domains of work. These obstacles include, but are not limited to:

  • Identifying available data sources
  • Determining fitness for use
  • Interoperability of data with divergent spatiotemporal basis
  • Understanding access modalities
  • Scoping required resources
  • Providing non-gridded data holdings in a gridded manner (see the sketch after this list)
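
The last item, turning non-gridded (point) holdings into a grid, can be illustrated with a minimal numpy sketch; the coordinates, values and cell size are made-up placeholders:

    import numpy as np

    # Made-up point observations: longitude, latitude, measured value
    lon = np.array([4.35, 4.40, 4.38, 4.52])
    lat = np.array([50.84, 50.85, 50.90, 50.88])
    val = np.array([12.1, 13.4, 11.8, 12.9])

    # Regular 0.05-degree target grid (placeholder resolution)
    lon_edges = np.arange(4.30, 4.61, 0.05)
    lat_edges = np.arange(50.80, 50.96, 0.05)

    # Per-cell sum and count, then the mean value where observations exist
    sums, _, _ = np.histogram2d(lon, lat, [lon_edges, lat_edges], weights=val)
    counts, _, _ = np.histogram2d(lon, lat, [lon_edges, lat_edges])
    gridded = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)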

There is great potential in integrating the diverse gridded resources available from EO sources into wider research domains. At present, however, subtle barriers block this potential. Within FAIRiCUBE, these issues are being collected and evaluated, and mitigation measures are being explored together with researchers from outside traditional EO domains, with the goal of breaking down these barriers and opening powerful research and data analysis capabilities to a wide range of scientists.

How to cite: Schleidt, K. and Jetschny, S.: FAIRiCUBE: Enabling Gridded Data Analysis for All, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7052, https://doi.org/10.5194/egusphere-egu23-7052, 2023.

EGU23-7074 | ECS | Posters on site | ESSI2.2

An EOSC-enabled Data Space environment for the climate community 

Fabrizio Antonio, Donatello Elia, Guillaume Levavasseur, Atef Ben Nasser, Paola Nassisi, Alessandro D'Anca, Alessandra Nuzzo, Sandro Fiore, Sylvie Joussaume, and Giovanni Aloisio

The exponential increase in data volumes and complexity is causing a radical change in the scientific discovery process in several domains, including climate science. This affects all stages of the data lifecycle, posing significant data management challenges in terms of data archiving, access, analysis, visualization, and sharing. The data space concept can support scientists' workflows and simplify the path towards more FAIR use of data.

In the context of the European Open Science Cloud (EOSC) initiative launched by the European Commission, the ENES Data Space (EDS) represents a domain-specific implementation of the data space concept. The service, developed in the frame of the EGI-ACE project, aims to provide an open, scalable, cloud-enabled data science environment for climate data analysis on top of the EOSC Compute Platform. It is accessible through the EOSC Catalogue and Marketplace (https://marketplace.eosc-portal.eu/services/enes-data-space) and also provides a web portal (https://enesdataspace.vm.fedcloud.eu) with information, tutorials and training materials on how to get started with its main features.

The EDS integrates into a single environment ready-to-use climate datasets, compute resources and tools, all made available through the Jupyter interface, with the aim of supporting the overall scientific data processing workflow. Specifically, the data store linked to the ENES Data Space provides access to a multi-terabyte set of variable-centric collections from large-scale global climate experiments. The data pool consists of a mirrored subset of CMIP (Coupled Model Intercomparison Project) datasets from the ESGF (Earth System Grid Federation) federated data archive, collected and kept synchronized with the remote copies using the Synda tool developed within the scope of the IS-ENES3 H2020 project. Community-based, open-source frameworks (e.g., Ophidia) and libraries from the Python ecosystem provide the capabilities for data access, analysis and visualization. Results and experiment definitions (i.e., Jupyter Notebooks) can be easily shared among users, promoting data sharing and application re-use towards a more Open Science approach.
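
A typical analysis in such a Jupyter environment might look like the sketch below; the CMIP6 file name follows ESGF conventions but is an assumption for illustration, not a guaranteed entry in the data pool:

    import numpy as np
    import xarray as xr

    # Hypothetical CMIP6 file from the mirrored ENES data pool
    ds = xr.open_dataset("tas_Amon_EC-Earth3_historical_r1i1p1f1_gr_185001-201412.nc")

    # Area-weight by cos(latitude) and compute a global-mean temperature series
    weights = np.cos(np.deg2rad(ds["lat"]))
    tas_global = ds["tas"].weighted(weights).mean(dim=("lat", "lon"))

    # Annual means, ready to plot or share as part of a notebook
    print(tas_global.groupby("time.year").mean())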

This work will present an overview of the data space capabilities, along with the key aspects of data management.

How to cite: Antonio, F., Elia, D., Levavasseur, G., Ben Nasser, A., Nassisi, P., D'Anca, A., Nuzzo, A., Fiore, S., Joussaume, S., and Aloisio, G.: An EOSC-enabled Data Space environment for the climate community, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7074, https://doi.org/10.5194/egusphere-egu23-7074, 2023.

EGU23-7786 | Posters on site | ESSI2.2

Constructing a Searchable Knowledge Repository for FAIR Climate Data 

Mark Roantree, Branislava Lalić, Stevan Savić, Dragan Milošević, and Michael Scriney

The development of a knowledge repository for climate science data is a multidisciplinary effort among domain experts (climate scientists), data engineers, whose skills include designing and building a knowledge repository, and machine learning researchers, who provide expertise on data preparation tasks such as gap filling and advise on the different machine learning models that can exploit this data.

One of the main goals of the CA20108 COST Action is to develop a knowledge portal that is fully compliant with the FAIR principles for scientific data management. In the first year, a bespoke Knowledge Portal was developed to capture metadata for FAIR datasets. Its purpose is to provide detailed metadata descriptions for shareable micro-meteorological (micromet) data using the WMO standard. While storing Network, Site and Sensor metadata locally, the system passes the actual data to Zenodo and receives back the DOI, thus creating a permanent link between the Knowledge Portal and the Zenodo storage platform (a sketch of this deposit step follows the list below). When a user searches the Knowledge Portal (metadata), the results provide both detailed descriptions and links to the data on Zenodo. Our adherence to the FAIR principles is documented below:

  • Findable. Machine-readable metadata is required for the automatic discovery of datasets and services. A metadata description is supplied by the data owners for all micro-meteorological data shared on the system; this metadata subsequently drives the search engine, using keywords or network, site and sensor search terms.
  • Accessible. When suitable datasets have been identified, access details should be provided. Assuming data is freely accessible, Zenodo DOIs and links are provided for direct data access.
  • Interoperable. Data interoperability means the ability to share and integrate data from different users and sources. This can only happen if a standard (meta)data model is employed to describe the data, an important concept which generally requires data engineering skills to deliver. In the Knowledge Portal presented here, the WMO guide provides the design and structure for the metadata.
  • Reusable. To truly deliver reusability, metadata should be expressed in as detailed a manner as possible; in this way, data can be replicated and integrated according to different scientific requirements. While the Knowledge Portal facilitates very detailed metadata descriptions, not all metadata is compulsory, as it was accepted that in some cases the overhead of providing this information can be very high.
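
The deposit-and-DOI handshake described above can be sketched against Zenodo's public REST API as follows; the access token, file name and metadata are placeholders, and error handling is omitted:

    import requests

    TOKEN = "..."  # placeholder Zenodo access token
    API = "https://zenodo.org/api"

    # 1. Create an empty deposition
    dep = requests.post(f"{API}/deposit/depositions",
                        params={"access_token": TOKEN}, json={}).json()

    # 2. Upload the dataset file into the deposition's file bucket
    with open("station_data.csv", "rb") as fp:          # placeholder file
        requests.put(f"{dep['links']['bucket']}/station_data.csv",
                     params={"access_token": TOKEN}, data=fp)

    # 3. Attach minimal descriptive metadata (placeholder values)
    meta = {"metadata": {"title": "Micromet station data",
                         "upload_type": "dataset",
                         "description": "Placeholder description.",
                         "creators": [{"name": "Doe, Jane"}]}}
    requests.put(f"{API}/deposit/depositions/{dep['id']}",
                 params={"access_token": TOKEN}, json=meta)

    # 4. Publish; the returned DOI is what the Knowledge Portal stores
    pub = requests.post(f"{API}/deposit/depositions/{dep['id']}/actions/publish",
                        params={"access_token": TOKEN}).json()
    print(pub["doi"])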

Simple analytics are in place to monitor the volume and size of networks in the system. Current metrics include: network count; average network size (number of sites); dates and sizes of datasets per network/site; numbers and types of sensors at each site; etc. The Portal is currently in beta, meaning that the system is functional but open only to members of the COST Action who are nominated testers. This status is due to change in Q1/2023, when access will be opened to the wider climate science community.

Current plans include new tools and services to assess data quality, including the level of gaps; in some cases, machine learning tools will be provided to attempt gap filling for datasets that meet certain requirements.

How to cite: Roantree, M., Lalić, B., Savić, S., Milošević, D., and Scriney, M.: Constructing a Searchable Knowledge Repository for FAIR Climate Data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7786, https://doi.org/10.5194/egusphere-egu23-7786, 2023.

EGU23-7842 | Orals | ESSI2.2 | Highlight

Destination Earth - Processing Near Data and Massive Data Handling 

Danaele Puechmaille, Jordi Duatis Juarez, Miruna Stoicescu, Michael Schick, and Borys Saulyak

Destination Earth is an operational service led by the European Commission and implemented jointly by ESA, ECMWF and EUMETSAT.

The presentation will provide insights into how Destination Earth implements near-data processing and massive data handling.

The objective of the European Commission's Destination Earth (DestinE) initiative is to deploy several highly accurate digital replicas of the Earth (Digital Twins) in order to monitor and simulate natural and human activities and their interactions, and to develop and test "what-if" scenarios that would enable more sustainable development and support European environmental policies. DestinE addresses the challenge of managing and making accessible the sheer amount of data generated by the Digital Twins, together with observation data located at external sites. This data will be made available fast enough, and in an analysis-ready format, to support the analysis scenarios proposed by the DestinE service users.