2021-09-29, 14:00–14:30, Humahuaca
Dataset sizes have grown far faster than compute speed. Complex analyses often mean long run times before we get any answers. GPUs are one of the easiest ways to parallelize large computations, with considerable time savings. cuSpatial is a FOSS library for accelerating spatial workflows on the GPU.
cuSpatial is an open-source indexing, transform, and geometry library being developed within the RAPIDS ecosystem. RAPIDS is a collection of FOSS high-performance GPU libraries for data science, machine learning, numerical analysis, and geospatial analytics.
cuSpatial is in development and enables GPU acceleration for a number of common GIS workflows including:
- GeoPandas integration
- fast I/O with Apache Arrow and Apache Parquet
- cubic spline fitting
- Hausdorff-distance-based clustering
- Haversine distance and geographic-to-Euclidean projection
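For readers unfamiliar with two of the distance measures listed above, here is a minimal CPU-only sketch in plain Python. It is not cuSpatial code and does not reflect cuSpatial's API; the function names `haversine_km` and `directed_hausdorff`, the constant `EARTH_RADIUS_KM`, and the choice of a 6371 km mean Earth radius are assumptions made for illustration. It only shows the kind of per-point arithmetic that a GPU library can run in parallel over many points at once.

```python
import math

# Assumption for this sketch: a spherical Earth with a mean radius of 6371 km.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def directed_hausdorff(points_a, points_b):
    """Directed Hausdorff distance from points_a to points_b (planar x/y tuples).

    For each point in points_a, find its nearest neighbour in points_b,
    then return the largest of those nearest-neighbour distances.
    """
    return max(
        min(math.hypot(ax - bx, ay - by) for bx, by in points_b)
        for ax, ay in points_a
    )
```

Note that the directed Hausdorff distance is asymmetric: swapping the two point sets generally gives a different value. The brute-force version above compares every pair of points, which is the sort of quadratic workload where GPU parallelism tends to pay off.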
This is a medium-level technical talk describing the relevance, development, and use of the cuSpatial library. The talk will progress as follows:
1. Me, NVIDIA
2. Why GPUs? Growing datasets and stagnating performance.
3. RAPIDS: A suite of GPU-accelerated FOSS data science libraries
4. cuSpatial: A Python library that integrates with GeoPandas and uses the GPU
5. Examples and uses of cuSpatial APIs
6. Interesting future cuSpatial APIs
7. Using cuSpatial with GeoPandas to read and write data
8. Using cuSpatial with cuDF's Parquet reader for Arrow-based I/O
9. Premade benchmark examples comparing cuSpatial to CPU-based methods
10. Call to action to give it a try, give feedback, or contribute to our GitHub repository.
Data collection, data sharing, data science, open data, big data, data exploitation platforms
Level – 2 - Basic. General basic knowledge is required.
Language of the Presentation –
Thomson Comer is currently writing open-source GPU-accelerated software for NVIDIA. He has ten years of experience consulting at a startup incubator and an M.S. in Computer Science.