Python Machine Learning Conference & GeoPython 2020

»A semantic querying system for Earth observation imagery«
2020-09-22, 11:00–11:30, Room 2

The project presented in this talk is a first step towards a semantic querying system for Earth observation imagery. It implements a Python-based inference engine that lets users infer information directly from EO imagery without domain knowledge or programming skills.

Earth observation (EO) data and the information derived from them are essential for understanding, modelling, and predicting natural and human-related processes. With the free and open data policies of Landsat and Copernicus, we have seen a massive increase in data volume (e.g. more than 3 TB every day from Copernicus satellites) as well as improved accessibility of these kinds of data. Technological developments, such as data cubes (e.g. OpenDataCube), scalable cloud-based analysis platforms (e.g. Google Earth Engine) and standardized data access APIs (e.g. OpenEO), are easing EO data retrieval and increasing possible processing speeds. This creates opportunities for a wide range of application domains, including ones that did not previously utilise EO data.

However, an important remaining challenge is that EO imagery cannot be directly translated into information. Imagery lacks inherent semantic meaning and therefore requires some degree of interpretation. For example, a multi-spectral Earth observation image consists of an array of digital numbers, while a user expects a semantic categorical value describing a real-world entity such as "vegetation". This limits how much of the value of EO imagery can be exploited, because interpretation is inherently ill-posed, requires advanced domain expertise, and complicates data-driven, machine-based information production.
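
To make the gap between digital numbers and semantic categories concrete, here is a minimal sketch. It assumes a tiny two-band image with invented values and uses a simple NDVI threshold as the interpretation step; the threshold of 0.4 and the two category names are illustrative assumptions, not the enrichment method used in the project.

```python
import numpy as np

# Hypothetical 2x3 image: raw digital numbers for a red and a near-infrared band.
# All values are invented for illustration.
red = np.array([[1200, 300, 2500], [400, 1500, 350]], dtype=float)
nir = np.array([[1300, 3200, 2600], [3000, 1400, 2900]], dtype=float)

# Normalized difference vegetation index: high where NIR reflectance exceeds red.
ndvi = (nir - red) / (nir + red)

# Interpretation step: map continuous values to a categorical label.
# The 0.4 threshold is an assumption chosen for this example only.
labels = np.where(ndvi > 0.4, "vegetation", "other")

print(labels)
```

Even this toy rule requires knowing which bands to combine and which threshold to pick, which is the kind of domain expertise the paragraph above refers to.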

The project addressed these issues by taking a first step in developing a semantic querying system for EO imagery and testing it using all Sentinel-2 imagery covering Austria as a first use case. The project implemented an AI-based image understanding system that enables a variety of users to directly infer information from semantically enriched EO imagery without the need for either domain knowledge or programming skills. Queries can be expressed in a newly developed, intuitive semantic query language and constructed graphically inside a user-friendly web frontend. The novelty of the query language is that it separates the extraction of semantic concepts from their analysis as multi-dimensional data structures. The queries are evaluated inside an inference engine developed in-house, which abstracts image-domain terminology, retrieves and wrangles the necessary data, and returns the requested information to the user. The inference engine is written in Python and is mostly built on top of the xarray library.
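
The separation between extracting a semantic concept and analysing it can be sketched with xarray. This is a rough illustration, not the project's query language or inference engine: the category codes, the toy data cube, and the helper names `extract_concept` and `count_over_time` are all hypothetical.

```python
import numpy as np
import xarray as xr

# Hypothetical category codes of a semantically enriched image (assumption).
CATEGORIES = {"vegetation": 1, "water": 2, "cloud": 3}

# Toy data cube: 3 acquisition dates, 2x2 pixels, one category code per pixel.
cube = xr.DataArray(
    np.array([
        [[1, 1], [2, 3]],
        [[1, 3], [2, 1]],
        [[1, 1], [2, 2]],
    ]),
    dims=("time", "y", "x"),
    coords={
        "time": np.array(
            ["2020-06-01", "2020-06-11", "2020-06-21"], dtype="datetime64[ns]"
        )
    },
    name="category",
)


def extract_concept(cube: xr.DataArray, concept: str) -> xr.DataArray:
    """Step 1: turn a semantic concept into a boolean cube (hypothetical helper)."""
    return cube == CATEGORIES[concept]


def count_over_time(mask: xr.DataArray) -> xr.DataArray:
    """Step 2: analyse the extracted cube, here by counting observations per pixel."""
    return mask.sum(dim="time")


vegetation = extract_concept(cube, "vegetation")
vegetation_count = count_over_time(vegetation)

print(vegetation_count.values)
```

Because the analysis step only sees a labelled boolean cube, the same reduction could be applied to any other concept without touching image-domain details.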

The proposed talk will introduce the developed semantic querying system, emphasize its importance for information extraction from EO imagery, and describe in more detail the ins and outs of the technical implementation.