Why faster processing of geospatial data is essential for US defense

The current state of international affairs and conflicts in multiple parts of the world requires a constant state of vigilance and preparedness for all U.S. defense and intelligence operations. Part of this preparation involves having a deep and accurate understanding of the ever-evolving geospatial landscape, in as close to real-time as possible.

Geospatial knowledge has always been a critical component of military intelligence, with the goal of minimizing the window from data capture to analysis to dissemination. Today that window is widening as the volume, variety and velocity of data increase at an unprecedented rate. More information is, ironically, leading to slower (and less accurate) decision-making. At the highest level, this latency can have an adverse impact on the geo-intelligence needed for any type of military operation.

LiDAR, hyperspectral

Various types of data are often kept in disparate databases, including LiDAR (light detection and ranging), multispectral, hyperspectral, SAR (synthetic aperture radar), aerial imagery and other modalities. The high costs, inflexibility and inability to share geospatial data between systems continue to be a major roadblock, because geospatial analysts are forced to use inefficient, time-consuming and error-prone data transfer methods.

Any single dataset gives only a limited amount of information and can be used for only a finite number of purposes. Integrating and linking multiple datasets or derived data products (related to a specific area of interest) is essential to gaining more insights, answering additional questions and making better decisions. In fact, geospatial data is most useful when multiple forms are shared, analyzed and used in combination with one another.

For example, a U.S. professor recently received a prestigious Fulbright U.S. Scholar Award for his humanitarian-focused geospatial research. This project involves synthesizing a vast amount of geospatial data including cell phone GPS tracking, remote sensing imagery, location references in text and more, to better understand and address the movement of Ukrainian refugees in Poland.

However, a data analyst has to download multiple file formats and develop their own processing pipelines in order to synthesize and enrich data. Before starting a processing task, the analyst must search multiple databases to find the data they need and then download this complex data in multiple formats as inputs into a processing pipeline, with each input requiring its own API. In a defense example, target detection using hyperspectral data requires a custom processing pipeline that also incorporates aerial imagery for context and possibly point clouds for advanced 3-D visualization.
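To make the fragmentation concrete, below is a minimal sketch of what stitching these inputs together typically looks like. It is illustrative only: the file names and download endpoint are hypothetical, and rasterio and laspy stand in for whichever format-specific libraries a given pipeline happens to require.

```python
# Illustrative sketch of today's fragmented workflow: every modality arrives in
# its own format, from its own store, through its own library or API.
# File names and the download endpoint are hypothetical placeholders.
import requests   # ad hoc HTTP download from one of several data stores
import rasterio   # reader for aerial and hyperspectral rasters (e.g., GeoTIFF)
import laspy      # reader for LiDAR point clouds (LAS/LAZ)

# 1. Pull each input from a different system, each with its own access pattern.
resp = requests.get("https://imagery-store.example/aoi_123/aerial.tif")  # hypothetical endpoint
with open("aerial.tif", "wb") as f:
    f.write(resp.content)

# 2. Open each download with a format-specific library.
with rasterio.open("aerial.tif") as src:
    aerial = src.read()            # (bands, rows, cols) array for visual context

with rasterio.open("hyperspectral_cube.tif") as src:  # assumed already staged locally
    cube = src.read()              # hundreds of narrow spectral bands

points = laspy.read("lidar_tile.laz")  # point cloud for 3-D visualization
xyz = points.xyz                       # N x 3 coordinate array

# 3. Only now can target detection begin, after hand-stitching three formats,
#    three libraries and separate downloads for a single area of interest.
```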

This approach limits the ability to do rapid processing across various sources. There isn’t a single place for all geospatial analytics and machine learning - including point cloud exploitation, image processing, feature detection, change detection or support for digital twins - which prevents deeper contextual understanding.

Faster assessments

Rapid processing from various sources is the key to achieving the type of integrated richness that supports faster assessments. Building on the defense target-detection example above, this type of analysis adds another level of complexity beyond basic data access and capture, because each data type has its own set of disparate closed-source and open-source analysis tools. Currently, advanced imagery analytics require custom tools with limited API integration. Imagine if there were a single API that optimized data access and could integrate with these tools.
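As a purely hypothetical sketch (none of the names below belong to a real library), such a unified access layer might reduce to a single call shape that hides the format-specific readers behind one catalog, so analysis tools integrate against one API rather than one per sensor type:

```python
# Hypothetical unified-access sketch: one call shape for every modality.
# `open_dataset` and the registry below are illustrative, not a real API.
import rasterio
import laspy

def _read_raster(path):
    with rasterio.open(path) as src:
        return src.read()

def _read_point_cloud(path):
    return laspy.read(path).xyz

# Logical dataset names mapped to storage locations and format-specific readers.
_REGISTRY = {
    "aoi_123/hyperspectral": ("hyperspectral_cube.tif", _read_raster),
    "aoi_123/aerial":        ("aerial.tif",             _read_raster),
    "aoi_123/lidar":         ("lidar_tile.laz",         _read_point_cloud),
}

def open_dataset(name):
    """Single access pattern; the per-format details stay behind the interface."""
    path, reader = _REGISTRY[name]
    return reader(path)

# Downstream analytics and ML tools call one API instead of one per data type.
cube   = open_dataset("aoi_123/hyperspectral")
aerial = open_dataset("aoi_123/aerial")
points = open_dataset("aoi_123/lidar")
```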

Finally, today’s geospatial analysts face restrictive compute limitations. In particular, they often have to spin up clusters, which slows time to insight and limits the ability to parallelize operations. Advances in serverless architectures do away with this need, allowing developers to spin applications up and down easily, without worrying about, or waiting for, hardware access.
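The following sketch illustrates the idea using only Python's standard library; the per-tile analytic is a hypothetical stand-in, and on a true serverless platform each tile would trigger its own short-lived function invocation rather than a local worker process:

```python
# Minimal sketch: fan an analytic out over independent tiles without
# provisioning or managing a cluster. Locally this uses a process pool; a
# serverless platform would instead scale function invocations up and down
# automatically as tiles arrive.
from concurrent.futures import ProcessPoolExecutor

def detect_targets(tile_id):
    """Hypothetical per-tile analytic, e.g. a hyperspectral target-detection step."""
    # ... open the tile, run the detector, collect detections ...
    return {"tile": tile_id, "detections": []}

tiles = [f"aoi_123/tile_{i:04d}" for i in range(256)]

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(detect_targets, tiles))
    # No cluster to spin up, wait for, or tear down; results stream in as tiles finish.
```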

We need a better approach, one that delivers insights in minutes rather than days, achieved through:

— A single platform to support all data modalities - there needs to be an efficient and unified method to store and analyze all geospatial data and derivative results;

— Distributed and highly scalable computing - allowing geospatial analysts to fully embrace the cloud to run any pipeline at scale without needing to initiate and activate clusters; and

— Built-in security and data sovereignty - all of this needs to be accomplished while protecting sensitive information and ensuring data integrity, with compliant and isolated on-premises capabilities to fulfill data sovereignty requirements for both your mission and partners.

Geospatial knowledge continues to offer a vast repository of insights that can be used for the betterment of defense and intelligence operations and, at a higher level, human society. However, the volume, variety and velocity of this data require a new approach to manage it cohesively, since current approaches are too fragmented.

Adopting such an approach will be the key to maximizing the power of geospatial information in the coming years, hopefully transforming data into life-changing intelligence within ever-tightening timeframes.

Norman Barker is the VP of Geospatial at TileDB. Prior to joining TileDB, Norman focused on spatial indexing and image processing, and held engineering positions at Cloudant, IBM and Mapbox.