Spatial Data Analysis

Many of today’s “big data” innovations and ideas are rooted in geospatial data sources. Most people carry at least one device, such as a smartphone, that tracks their location; the result is geospatial data. Entire industries, such as the Internet of Things (IoT), drones, and autonomous cars, rely on this kind of reliable, real-time geospatial data. Similarly, concepts like “smart cities” are built around the idea of using geospatial data.

Geospatial intelligence is actionable knowledge, a process, and a profession. It is the ability to describe, understand, and interpret events so as to anticipate the human impact of an event or action within a spatiotemporal environment. It is also the ability to identify, collect, store, and manipulate data to create geospatial knowledge through critical thinking, geospatial reasoning, and analytical techniques. Finally, it is the ability to present knowledge in a way that is appropriate to the decision-making environment. Geospatial intelligence (GEOINT) is a broad field that encompasses the intersection of geospatial data with social, political, environmental, and numerous other factors. The Intelligence Community defines geospatial intelligence as “the use and analysis of geospatial information to assess geographically referenced activities on Earth.”

Uses of Geospatial Intelligence

  • The role of machine learning and GEOINT in disaster response
  • Open geospatial data platforms and food scarcity
  • Interoperability for GEOINT applications and data in the military
  • The role of data stewardship in crisis mapping

Machine Learning and GEOINT

Geospatial intelligence software, augmented with machine learning, could help to map changes in terrain and structures, making disaster response projects more efficient and more effective. Several organizations are looking toward algorithms to help create more timely and accurate maps. One example is the SpaceNet “Road Detection and Routing Challenge,” a $50,000 competition to develop an automated method for extracting information about road networks. Crowdsourced data proved to be an invaluable resource in the response to Hurricane Maria in Puerto Rico, but the successful implementation of machine learning could yield faster and more accurate maps to help emergency personnel find people in need or identify the best routes for delivering supplies.
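Once a road network has been extracted, finding the best route for delivering supplies becomes a shortest-path problem on a graph. The sketch below uses Dijkstra's algorithm from the standard library only. The junction names, travel times, and the `shortest_route` helper are hypothetical, chosen for illustration; this is not the method used in the SpaceNet challenge.

```python
import heapq


def shortest_route(graph, start, goal):
    """Return (total_cost, path) for the cheapest route, or (inf, []) if unreachable."""
    queue = [(0.0, start, [start])]   # priority queue ordered by accumulated cost
    best = {start: 0.0}               # cheapest known cost to reach each node
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue                  # stale queue entry, a cheaper route was found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical road network: travel time in minutes between junctions.
roads = {
    "depot": {"A": 10, "B": 4},
    "A": {"shelter": 5},
    "B": {"A": 3, "shelter": 15},
    "shelter": {},
}

cost, path = shortest_route(roads, "depot", "shelter")
print(cost, path)  # 12.0 ['depot', 'B', 'A', 'shelter']
```

If updated imagery showed that the road from B to A was impassable, removing that edge from `roads` and rerunning the function would return the next-best route through A directly.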

Data Collection

Imagery

This type of data is collected by a system of sensors and platforms. Sensors collect the data, and platforms are the vehicles or objects to which sensors are attached. The sensor or platform used depends on the type of data to be collected, the conditions in which it is collected, and the purpose for which it will be used.

Platform

This section addresses satellite, aircraft, ground-based, and sea-based platforms. Aircraft, drones, aerostats, balloons, and dirigibles are referred to as airborne platforms. Airborne, ground-based, and sea-based platforms may be manned or unmanned/unattended.

GIS Programming

Web GIS programming involves creating, extending, or using Web GIS or web mapping solutions to solve specific problems, build complete applications, or consume or produce data and geospatial processing services. In addition, a number of Web GIS software options offer application programming interfaces (APIs) through which developers can use the published data and processing services of others to build and customize applications via standardized interfaces with external Web GIS software, data, and services. Among the most popular open source and proprietary APIs used in web GIS programming projects, including those focused on mashups, are the Google Maps API, Mapbox GL JS, OpenLayers, the Esri ArcGIS API for JavaScript, the ArcGIS REST API, the ArcGIS API for Python, the ArcGIS Runtime SDKs (software development kits), and Leaflet. For example, the Esri GeoServices REST specification provides a standard way for clients to communicate with servers through the REST architecture using RESTful web services and URLs. In web GIS programming, such hosted Web GIS software libraries, along with other scripting-language libraries such as jQuery, provide widely available, reliable collections of modular code that developers can use to simplify writing programs. OpenLayers and Leaflet are examples of open-source libraries used to create web map mashups.
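To make the idea of communicating through URLs concrete, the sketch below builds a query URL in the style of a GeoServices REST layer `query` operation. The host and service path are hypothetical, the `build_query_url` helper is an illustration only, and the parameter names (`where`, `outFields`, `geometry`, `geometryType`, `inSR`, `spatialRel`, `f`) should be checked against the documentation of the server being used. No request is sent; the code only assembles the URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(layer_url, where="1=1", out_fields="*", bbox=None, fmt="json"):
    """Assemble a layer query URL; bbox is (xmin, ymin, xmax, ymax) in WGS 84."""
    params = {"where": where, "outFields": out_fields, "f": fmt}
    if bbox is not None:
        xmin, ymin, xmax, ymax = bbox
        # Restrict results to features intersecting the bounding box.
        params.update({
            "geometry": f"{xmin},{ymin},{xmax},{ymax}",
            "geometryType": "esriGeometryEnvelope",
            "inSR": "4326",
            "spatialRel": "esriSpatialRelIntersects",
        })
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


# Hypothetical feature layer endpoint, for illustration only.
layer = "https://example.com/arcgis/rest/services/Roads/FeatureServer/0"
url = build_query_url(layer, where="TYPE = 'highway'", bbox=(-67.3, 17.9, -65.2, 18.6))
print(url)

# Decode the query string again to confirm what a server would receive.
received = parse_qs(urlsplit(url).query)
print(received["where"], received["geometry"])
```

Keeping URL construction in one small function makes it easy to reuse the same query from a script, a notebook, or a web map client.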

Spatial Data Science (SDS)

Spatial Data Science is a subset of Data Science that focuses on the special characteristics of spatial data, using modelling to understand where and why things happen. Geographic information systems (GIS) serve a wide range of users and use cases, yet GIS remains an anomaly: despite its value across many industries, it has stayed a niche field, often siloed from other business units. GIS typically refers to varied types of information systems, such as websites, apps, or databases, that store different types of spatial data. With new types of users such as data scientists, GIS work is increasingly happening outside traditional GIS tools, allowing more sophisticated spatial analyses to take place in connection with new Data Science and Big Data solutions. This shift is allowing Spatial Data Science to emerge as a discipline that interacts more closely with open source and cloud technologies.
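One special characteristic of spatial data is that distance has to be measured on a curved surface rather than a flat plane. As a minimal sketch, assuming a spherical Earth, the haversine formula below computes great-circle distance and uses it to find the nearest of several candidate locations. The function names are hypothetical and the city coordinates are approximate, included only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(point, candidates):
    """Return the name of the candidate closest to point (lat, lon)."""
    return min(candidates, key=lambda name: haversine_km(*point, *candidates[name]))


# Approximate coordinates, for illustration only.
san_juan = (18.4655, -66.1057)
cities = {
    "Ponce": (18.0111, -66.6141),
    "Bayamon": (18.3985, -66.1557),
    "Mayaguez": (18.2011, -67.1396),
}

print(nearest(san_juan, cities))  # Bayamon
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))  # one degree of longitude at the equator
```

For high-precision work an ellipsoidal model would be more appropriate; the spherical version is shown because it is short and has no dependencies.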

Using R as GIS

Until recently, serious work with geospatial data required an often-proprietary desktop GIS (Geographic Information System), such as ArcGIS. Now, however, GIS capabilities in R have greatly advanced. In many ways, the benefits of using R over a desktop GIS are similar to the benefits of using R over Excel for data analysis. R is free and open source, which makes it easier and cheaper to get started, compared to the expensive licenses needed for many desktop GIS packages. This has helped spread geospatial data analysis beyond the domain of GIS specialists, opening the field to a wider ecosystem of contributors. As a full-fledged programming language, R offers greater flexibility through its command line interface than the point-and-click interface of a desktop GIS, so there is far less restriction on what is ultimately achievable. Analyses scripted in R are reproducible in a way that point-and-click workflows are not, which is key to the scientific process. The greater shareability of R scripts and packages encourages a faster development cycle and a more collaborative workflow.

Using Python as GIS

Geospatial analysis and Python have grown together over the last decade and more. Initially, this marriage between a programming language and geospatial platforms occurred when major GIS platforms such as ArcGIS and QGIS began to adopt Python as their main scripting, toolmaking, and analytical language. For users, perhaps the main reason for adopting Python is that it is easy to learn, good at data manipulation, and has many useful libraries that are suited to, or could easily be adapted for, geospatial analysis. pandas makes data manipulation and analysis far easier than in some other languages, while GeoPandas focuses on making the benefits of pandas available for geospatial data, using common spatial objects and adding capabilities such as interactive plotting and performance improvements. The large and growing list of Python libraries gives users many options to leverage existing code and build more powerful features into their tools. Platforms such as QGIS allow users to add their own extensions built in Python, further encouraging development and use of Python among GIS specialists. This growth suggests that as GIS users and geospatial analysts develop their skills, Python may be the best language to focus on.
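A typical task in this kind of work is a spatial join: assigning each point to the polygon that contains it. In practice a library such as GeoPandas would normally handle this (for example through `geopandas.sjoin`); the dependency-free sketch below only illustrates the underlying logic with a ray-casting point-in-polygon test. The zone shapes, point names, and helper functions are hypothetical, and points lying exactly on a boundary are not treated specially.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon given by ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]        # wrap around to close the ring
        if (y1 > y) != (y2 > y):          # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:               # crossing lies to the right of the point
                inside = not inside
    return inside


def spatial_join(points, zones):
    """Map each point id to the first zone containing it, or None."""
    result = {}
    for pid, (x, y) in points.items():
        result[pid] = next(
            (zid for zid, ring in zones.items() if point_in_polygon(x, y, ring)),
            None,
        )
    return result


# Hypothetical zones (simple rectangles) and sample points in planar coordinates.
zones = {
    "north": [(0, 5), (10, 5), (10, 10), (0, 10)],
    "south": [(0, 0), (10, 0), (10, 5), (0, 5)],
}
points = {"p1": (2, 7), "p2": (8, 1), "p3": (12, 3)}

joined = spatial_join(points, zones)
print(joined)  # {'p1': 'north', 'p2': 'south', 'p3': None}
```

A library implementation adds spatial indexing, coordinate reference system handling, and support for holes and multipart geometries, which this sketch leaves out.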