Dustin Sampson


Efficiencies of Scale with Imagery Pipelines, Cloud Optimized GeoTIFFs, and SpatioTemporal Asset Catalogs
Paul Trudt, Dustin Sampson

Bayer Crop Science recently unified its internal imagery platform capabilities by completing a full rewrite of its core imagery APIs, combining the performance gains of Cloud Optimized GeoTIFFs (COGs) with the efficiency and extensibility of the SpatioTemporal Asset Catalog (STAC). With imagery pipeline outputs standardized on COGs, all developers and imagery scientists at Bayer Crop Science can access the full spatial imagery catalog as STAC Item/Asset records and implement common file access patterns. One potential benefit of adopting STAC-accessible COGs as the standard pipeline output is the ability to eliminate, at scale, the unnecessary data transfers, local writes, and reads of unwanted peripheral pixels that imagery-based ML training and processing would otherwise incur. To do this, our imagery team developed a new pilot Imagery-as-Array API that returns band-specific, AOI-targeted range-and-column pixels as NumPy arrays for processing and analysis. When data transfers, reads, and writes are limited to the targeted pixels, the milliseconds saved per image add up quickly across thousands of images to hours of saved network and compute time, which in turn supports faster, higher-quality iterations. The overall rewrite effort aligned with Bayer's global digital transformation objectives and firmly established the imagery platform as a scalable and durable pivot between imagery capture, post-processing, and decision-science-based analytics to help drive future research and commercial advancements. This presentation will provide an overview of the Bayer Crop Science imagery ecosystem and the incremental efficiencies gained from integrating COGs with other open source software capabilities.
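To make the idea of reading only targeted pixels concrete, the following is a minimal sketch, not the Imagery-as-Array API itself. It assumes a north-up raster with square pixels and square internal tiles, as is typical for a COG, and shows how an AOI bounding box maps to a pixel window and to the small set of internal tiles that actually need to be fetched. The names `Window`, `aoi_to_window`, and `tiles_touched`, along with all numeric values, are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Pixel window: column/row offset of the upper-left corner plus size."""
    col_off: int
    row_off: int
    width: int
    height: int


def aoi_to_window(bounds, origin_x, origin_y, pixel_size, raster_width, raster_height):
    """Map AOI bounds (minx, miny, maxx, maxy) to a pixel window.

    Assumes a north-up raster: origin is the upper-left corner, x grows
    with the column index, and y shrinks as the row index grows.
    """
    minx, miny, maxx, maxy = bounds
    # Expand outward to whole pixels, then clamp to the raster extent.
    col_start = max(0, math.floor((minx - origin_x) / pixel_size))
    col_stop = min(raster_width, math.ceil((maxx - origin_x) / pixel_size))
    row_start = max(0, math.floor((origin_y - maxy) / pixel_size))
    row_stop = min(raster_height, math.ceil((origin_y - miny) / pixel_size))
    if col_stop <= col_start or row_stop <= row_start:
        raise ValueError("AOI does not overlap the raster")
    return Window(col_start, row_start, col_stop - col_start, row_stop - row_start)


def tiles_touched(window, tile_size):
    """List the (tile_row, tile_col) indices of internal tiles the window intersects."""
    first_col = window.col_off // tile_size
    last_col = (window.col_off + window.width - 1) // tile_size
    first_row = window.row_off // tile_size
    last_row = (window.row_off + window.height - 1) // tile_size
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


# Hypothetical 10000 x 10000 pixel image at 0.5 m resolution with 512-pixel tiles.
window = aoi_to_window(
    bounds=(500100, 4099700, 500200, 4099800),
    origin_x=500000,
    origin_y=4100000,
    pixel_size=0.5,
    raster_width=10000,
    raster_height=10000,
)
tiles = tiles_touched(window, tile_size=512)
total_tiles = math.ceil(10000 / 512) ** 2
print(window)
print(f"{len(tiles)} of {total_tiles} tiles needed")
```

In this hypothetical case the AOI touches 2 of the image's 400 internal tiles, so a reader that issues HTTP range requests for only those tiles skips the rest of the file. In practice a GDAL-based reader would perform this bookkeeping internally; the sketch only illustrates where the per-image savings come from.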

Use Cases and Applications