James Hughes
Jim Hughes is a core committer for GeoMesa, which leverages HBase, Accumulo and other distributed systems to provide distributed computation and query capabilities. He is also a committer for LocationTech JTS and SFCurve.
Sessions
LocationTech GeoMesa is a suite of tools for working with big geospatial data. It builds on Apache big data projects such as HBase, Kafka, NiFi, and Spark to enable persistence, streaming, ETL, and analysis.
In this talk, we will give background on the core capabilities of GeoMesa and discuss improvements made in GeoMesa 3.x over the last year. These changes include support for newer versions of Scala (enabling Spark 3.x support), Kafka, and HBase/Accumulo. Other improvements include new features that make GeoMesa components easier to use in Docker and Kubernetes.
As IoT-based use cases have increased, so has the need to create real-time geospatial views of the data they generate. In this talk, we will describe how LocationTech GeoMesa integrates with Apache Kafka and Apache NiFi to enable spatial data streaming and data management.
The first part of the talk will dive into the details of indexing observation data for entities moving through space and time. Examples will show how GeoServer can be polled to show a live picture of multiple moving entities.
The second part of the talk will focus on using NiFi to route data through an enterprise. Apache NiFi provides a visual programming interface to create, represent, and monitor data flows. The GeoMesa-NiFi project provides Processors which allow for handling spatial data with GeoMesa and GeoTools DataStores.
Many of the Apache projects serving the big data space do not come with out-of-the-box support for geospatial data types like points, lines, and polygons. LocationTech GeoMesa provides add-on support for distributed databases such as Apache Accumulo, Apache Cassandra, Apache HBase, and Redis by crafting spatial and spatio-temporal keys. In addition to distributed databases, GeoMesa enables spatial storage in many of the popular Apache file format projects such as Arrow, Avro, ORC, and Parquet. This talk will review the basics of big geo data persistence, whether in a data lake or in a database, and provide an overview of the benefits (and limitations) of each technology.
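To make the idea of a spatial key concrete, here is a minimal sketch of one common approach: a Z-order (Morton) curve that interleaves the bits of a discretized longitude and latitude into a single sortable integer. This is an illustration under stated assumptions, not GeoMesa's actual index code; the function names, the 16-bit resolution, and the longitude-first bit order are all hypothetical choices made for the example.

```python
def normalize(value: float, lo: float, hi: float, bits: int) -> int:
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    cells = 1 << bits
    index = int((value - lo) / (hi - lo) * cells)
    # The upper bound itself would land one past the last cell; clamp it.
    return min(index, cells - 1)


def interleave(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x and y (x in even positions, y in odd)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def z_order_key(lon: float, lat: float, bits: int = 16) -> int:
    """Hypothetical helper: build a Z-order key for a lon/lat point."""
    x = normalize(lon, -180.0, 180.0, bits)
    y = normalize(lat, -90.0, 90.0, bits)
    return interleave(x, y, bits)
```

Points that fall in the same grid cell receive the same key, and points that are close in space tend to share key prefixes, which is what lets a sorted key-value store answer a bounding-box query with a small number of range scans. A spatio-temporal key could extend the same idea by interleaving a discretized time value as a third dimension.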