GeoPython 2022

Large-scale geospatial and temporal dataset
2022-06-20, 09:15–09:45, Room 1

Spatial and temporal data are in high demand among data scientists and crop domain experts who want to quickly develop models that help farmers optimize crop production in a climate-friendly way. This requires an efficient way to create, save and load the data. Our solution is to store the data in one large multi-dimensional geospatial and temporal dataset.


Currently, spatial and temporal data are handled separately in different departments. We want to create a shared store where everyone has easy access to a single, high-quality dataset.

Examples of the spatial and temporal data include:

Satellite images:
- Sentinel-1
- Sentinel-2
- Calculated cloud masks

Other georeferenced data:
- Harvest yield
- Elevation maps
- Weather data
- Annotations such as crops falling over (lodging)

Agriculture is a special case: the areas of interest, the crop fields, are small relative to the area covered by a satellite image. Farmers are interested in both field-level and pixel-level mathematical models to support precision farming and to explain relations within their fields.
The dataset was created by arranging the geospatial data by UTM zone into a Zarr dataset with a resolution of 10 m x 10 m in space and 1 day in time, stored in chunks of 10 km x 10 km by 7 days. Saving and loading were done using Dask arrays and Xarray. The solution lets us easily add new dimensions to the dataset and navigate to the relevant section automatically, which makes the large multi-dimensional dataset easy to handle and scale.
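
To make the chunk layout concrete, here is a minimal sketch of the arithmetic it implies and of how a UTM coordinate and a date could be mapped to the chunk that holds them. It is not the implementation used in the project. The resolution and chunk sizes come from the description above; the grid origin, the start date, the (time, y, x) dimension order, a y axis that increases northward, and the helper name `locate` are all assumptions made for illustration.

```python
from datetime import date

# Values taken from the abstract.
PIXEL_SIZE_M = 10       # spatial resolution: 10 m x 10 m
CHUNK_SIZE_M = 10_000   # spatial chunk: 10 km x 10 km
CHUNK_DAYS = 7          # temporal chunk: 7 daily time steps

# Hypothetical values, NOT from the abstract: south-west corner of the
# grid within one UTM zone, and the first day stored in the dataset.
ORIGIN_EASTING_M = 400_000
ORIGIN_NORTHING_M = 6_100_000
START_DATE = date(2022, 1, 1)

# 10 km / 10 m = 1000 pixels along each spatial side of a chunk.
PIXELS_PER_CHUNK = CHUNK_SIZE_M // PIXEL_SIZE_M

# Assumed dimension order: (time, y, x).
CHUNK_SHAPE = (CHUNK_DAYS, PIXELS_PER_CHUNK, PIXELS_PER_CHUNK)


def locate(easting_m, northing_m, day):
    """Map a UTM coordinate and a date to a chunk index and in-chunk offset.

    Returns ((t_chunk, y_chunk, x_chunk), (t_offset, y_offset, x_offset)).
    """
    # Global pixel and time-step indices relative to the assumed origin.
    t = (day - START_DATE).days
    y = int((northing_m - ORIGIN_NORTHING_M) // PIXEL_SIZE_M)
    x = int((easting_m - ORIGIN_EASTING_M) // PIXEL_SIZE_M)
    if min(t, y, x) < 0:
        raise ValueError("point lies before the grid origin or start date")

    # Split each global index into (chunk number, position inside the chunk).
    t_chunk, t_offset = divmod(t, CHUNK_DAYS)
    y_chunk, y_offset = divmod(y, PIXELS_PER_CHUNK)
    x_chunk, x_offset = divmod(x, PIXELS_PER_CHUNK)
    return (t_chunk, y_chunk, x_chunk), (t_offset, y_offset, x_offset)


if __name__ == "__main__":
    print(CHUNK_SHAPE)
    print(locate(512_345, 6_187_650, date(2022, 6, 20)))
```

With these assumed origins, a single field a few hundred metres across usually falls inside one spatial chunk, or a handful when it straddles a chunk boundary, so a per-field query reads only a small part of the dataset. Xarray and Dask perform this kind of lookup internally when a chunked Zarr dataset is sliced by coordinates.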