GeoPython 2022

Who Said Wrangling Geospatial Data at Scale was Easy?
2022-06-21, 14:00–14:30, Room 2

In this talk, I’ll briefly introduce the main forms geospatial data comes in. I’ll then focus on the most efficient ways to condense large amounts of geospatial data into analyzable chunks, speeding up data processing and analysis.


If you have ever worked with census data, you may remember the nightmare of spending hours staring at it, unable to download it, store it, or convert it to a sensible format before your analysis could even begin. And census data is not even unstructured data!

Geospatial data comes in various formats: GeoJSON, Parquet, Shapefile, GeoTIFF, GeoPackage, and more. But what are the most efficient ways to convert the data into formats that are easy to understand, work with, transfer, and ultimately analyze? Throw in petabytes' worth of data and you hit the challenge of wrangling geospatial data at scale.
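
As a small illustration of breaking data into analyzable chunks, here is a minimal sketch that splits a GeoJSON FeatureCollection into smaller FeatureCollections using only the Python standard library. It is not taken from the talk: the `chunk_features` helper, the chunk size, and the sample points are all hypothetical, and at real scale a columnar format or a chunked array library would likely be a better fit.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def chunk_features(features: Iterable[dict], chunk_size: int) -> Iterator[dict]:
    """Yield FeatureCollections holding at most chunk_size features each.

    Hypothetical helper for illustration; not part of any library.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(features)
    while True:
        # Pull the next batch lazily so the full input never has to be copied.
        batch = list(islice(iterator, chunk_size))
        if not batch:
            return
        yield {"type": "FeatureCollection", "features": batch}


# Hypothetical sample data: five point features standing in for a large dataset.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [float(i), float(i)]},
            "properties": {"id": i},
        }
        for i in range(5)
    ],
}

chunks = list(chunk_features(sample["features"], chunk_size=2))
chunk_sizes = [len(c["features"]) for c in chunks]

# Each chunk is itself a GeoJSON FeatureCollection and serializes independently.
serialized = [json.dumps(c) for c in chunks]
```

Because every chunk stands alone, each one can be written to its own file or handed to a separate worker, which is the basic idea behind processing a dataset that does not fit in memory.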

This talk will walk through some of the best ways to handle geospatial data at scale, with a focus on:

  • The xarray-spatial library for raster-based spatial analysis.
  • The RTXpy library for GPU-powered spatial analysis.
  • Microsoft Planetary Computer examples of geospatial data processing.