
Estimation Tool for Spatial Prediction Models


Authors

A project from the course Geosoftware 2 at the Institute for Geoinformatics, University of Münster, by Jakob Danel, Fabian Schumacher, Thalis Goldschmidt, Henning Sander and Frederick Bruch

Abstract

Machine learning methods have become very popular for spatial prediction tasks such as classifying remote sensing images, especially because of their ability to learn non-linear relationships and thereby solve more complex classification tasks. An often underestimated issue is that machine learning algorithms can only provide meaningful predictions when applied to data that is similar to the data they were trained on (Meyer and Pebesma, 2021). "Similar" here refers to the value ranges of the predictor variables (such as the different bands of a remote sensing image). When a trained machine learning algorithm is applied to a new geographic area, it is unclear whether the pixels' properties in that area are similar enough to the training data to enable a reliable classification.

Area Of Applicability (AOA)

The Area Of Applicability is a method developed by Meyer and Pebesma (2021) to delineate areas in spatial data (here remote sensing images) that can be assumed to be areas the machine learning model can reliably be applied to. The AOA provides important additional information that should be communicated when applying machine learning methods to spatial prediction tasks, especially when predicting on a large or even global scale when training data are not evenly distributed over the target area.

Aim of the tool

The tool combines all the steps needed to perform a land use/land cover (LULC) classification: generation of satellite images, model training and prediction. In particular, it is designed to extend these steps with the AOA and to integrate this method into the typical workflow of a remote sensing scientist or researcher without requiring them to deal with its concrete implementation. Besides delineating such an area of applicability (AOA), the tool can also point to areas where additional training data should be collected in order to train a more applicable model.

Target group

Researchers and users of remote sensing methods who want to

  • use machine learning for land use classifications
  • work with Sentinel-2 data
  • know how to train and apply machine learning models, but are unable or unwilling to focus on understanding and implementing the Area of Applicability
  • work with large-scale mapping/modeling applications, but lack the necessary hardware to perform machine learning

How does the software work?

The user first selects a model to work with: they can either upload their own model via an upload button or create a new one and train it with a selectable machine learning algorithm. Depending on this choice, only specific parts of the software are executed.

Input

  • Area of interest: The area for which the land use classification and the AOA are to be calculated.
  • Training data or model: If a new model is to be created, training data must be uploaded. Otherwise, an existing model has to be uploaded by the user.
  • Machine learning algorithm and hyperparameters: If a new model is trained, the user can choose between two machine learning algorithms and, if desired, also pass hyperparameters.
  • Time period: The period in which available Sentinel-2 images are searched for.
  • Bands/predictors: All bands/predictors to be included in the Sentinel images.
  • Resolution: Resolution of the Sentinel images to be generated.
  • Maximum cloud cover: The satellite imagery search is filtered by this maximum cloud cover.

Part 1: Satellite image generation (with R)

Generation of a Sentinel-2 satellite image for the area of interest (Sentinel Image (AOI))

  • Based on the user inputs (area of interest (AOI), time period and cloud cover), the SpatioTemporal Asset Catalog (STAC) is searched for matching Sentinel-2 satellite images.
  • For each Sentinel-2 image found, all bands (except B10) are available for download. Only those pre-selected by the user are processed further.
  • If more images are found, at most 400 are used for the further calculations.
  • All images (at most 400) are then superimposed, and for each pixel the median over all images is calculated for each band.
  • This helps to avoid problems with cloud cover and other interfering factors: the more images that are found, the more likely it is to obtain a good image for model training and LULC classification.
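The search-and-composite workflow above can be sketched with the rstac and gdalcubes packages listed under Dependencies. This is an illustrative outline, not the project's actual script: the STAC endpoint, collection name, AOI coordinates, dates, cloud-cover limit and band selection are all example values.

```r
# Sketch of Part 1: STAC search + per-pixel median composite (example values).
library(rstac)
library(gdalcubes)

# 1. Search a STAC catalog for Sentinel-2 scenes matching AOI and time period
items <- stac("https://earth-search.aws.element84.com/v0") |>
  stac_search(collections = "sentinel-s2-l2a-cogs",
              bbox = c(7.5, 51.9, 7.7, 52.0),      # AOI (xmin, ymin, xmax, ymax)
              datetime = "2021-06-01/2021-08-31",  # user-selected time period
              limit = 400) |>                      # cap at 400 images
  post_request()

# 2. Build an image collection, keeping only scenes below the cloud-cover limit
col <- stac_image_collection(items$features,
                             property_filter = function(x) x[["eo:cloud_cover"]] < 20)

# 3. Define the target cube (user-selected resolution) and reduce over time
#    with the per-pixel median of each selected band
v <- cube_view(srs = "EPSG:3857", extent = col, dx = 10, dy = 10, dt = "P1Y")
composite <- raster_cube(col, v) |>
  select_bands(c("B03", "B05", "B07")) |>
  reduce_time("median(B03)", "median(B05)", "median(B07)")
write_tif(composite, dir = "out")
```

The `reduce_time("median(...)")` step is what implements the per-pixel, per-band median described above.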

Generation of a Sentinel-2 satellite image for the areas where the training data is located (Sentinel Image (training area))

  • The generation of a Sentinel-2 satellite image for the areas where the training data is located is only done if the user chose to create a new model and therefore has uploaded training data.
  • It works analogously to the generation of the Sentinel-2 image for the AOI. Instead of filtering by the AOI, it filters by the geometry of the training polygons. Pixels outside the polygons are set to NA.
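The masking step can be sketched with terra: crop the composite to the training polygons and set everything outside them to NA. File names here are placeholders, not the tool's actual paths.

```r
# Illustrative sketch: restrict the composite to the training polygons,
# setting all pixels outside them to NA (file names are placeholders).
library(terra)

composite <- rast("sentinel_training_area.tif")   # composite built as in Part 1
polygons  <- vect("training_data.gpkg")           # uploaded training polygons

# Crop to the polygons' extent, then mask: pixels not covered become NA
masked <- mask(crop(composite, polygons), polygons)
writeRaster(masked, "sentinel_training_area_masked.tif", overwrite = TRUE)
```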

Part 2: Calculation of indices (with R)

Additional indices can only be selected if the bands required for their calculation have also been chosen. They are then computed and used as additional predictors during model training.

  • Available indices:
    • NDVI, NDVI_sd_3x3, NDVI_sd_5x5
    • BSI
    • BAEI

Part 3: Model training (with R)

If the user chooses to work with their own model, no further model training is needed. If the user chooses to create a new model, some additional steps must be performed to obtain valid training data. The generated Sentinel image of the training areas (consisting of all selected bands) is combined with the information from the uploaded training data: each pixel completely covered by a training polygon is assigned the class of that polygon. The result is a dataset of all overlaid pixels with their assigned class and spectral information, which is then used to train the model.

The user can choose whether to train the model with a random forest algorithm or with a support vector machine. For both, hyperparameters can be set. The model's performance is validated with a spatial cross-validation method that leaves out whole training polygons.
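The training step can be outlined with caret and CAST, which are both listed under Dependencies. This is a hedged sketch: the data frame `training_df` and its column names (`class`, `polygon_id`) are assumptions for illustration, and the hyperparameter value is an example.

```r
# Sketch: random forest via caret with spatial cross-validation that
# leaves out whole training polygons (CAST::CreateSpacetimeFolds).
library(caret)
library(CAST)

# training_df: one row per pixel, with its band/index values, its class,
# and the id of the training polygon it came from (built in the step above)
folds <- CreateSpacetimeFolds(training_df, spacevar = "polygon_id", k = 5)

model <- train(class ~ .,
               data = training_df[, !(names(training_df) %in% "polygon_id")],
               method = "rf",                      # or "svmRadial" for an SVM
               tuneGrid = expand.grid(mtry = 2),   # example hyperparameter
               trControl = trainControl(method = "cv",
                                        index = folds$index,
                                        savePredictions = "final"))
```

Passing the polygon-based folds via `index` is what makes the cross-validation spatial: all pixels of a polygon are held out together, so performance is not inflated by spatially autocorrelated neighbors.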

Part 4: Prediction and AOA (with R)

With the help of the trained model and the generated Sentinel image for the AOI, a prediction is calculated. To make statements about the applicability of the model, especially in unknown areas, the AOA is computed. In the areas where the model is not applicable according to the AOA, random points are generated and suggested to the user as potential locations for collecting new training data. If such data is acquired and incorporated into the model, better results can be expected.
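This step maps directly onto the CAST package, whose `aoa()` function is the reference implementation of the method by Meyer and Pebesma (2021). The following is a sketch under the assumption that `predictors` is the predictor SpatRaster and `model` the caret model from the previous parts; the sample size is an example value.

```r
# Sketch of Part 4: prediction, AOA, and candidate new training locations.
library(terra)
library(CAST)

prediction <- predict(predictors, model, na.rm = TRUE)   # LULC classification
model_aoa  <- aoa(predictors, model)                     # dissimilarity index + AOA

# In the AOA raster, 1 = inside, 0 = outside the area of applicability.
# Keep only the "outside" cells and sample random candidate points there.
outside   <- classify(model_aoa$AOA, cbind(1, NA))       # drop AOA == 1 cells
new_sites <- spatSample(outside, size = 50, method = "random",
                        na.rm = TRUE, as.points = TRUE)
```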

How to install and run the app

To make it as simple as possible, we used Docker for development. The only thing necessary to run this software is to download this repository with git clone --recursive https://github.com/geo-tech-project/geotech.git and then run sudo docker-compose up in the command line interface. This command loads two images from Docker Hub, one for the frontend and one for the backend. Loading may take a while, as all dependencies (e.g., R packages) are loaded as well. After loading, the application starts automatically. It is accessible via your own IP address on port 8780, for example http://localhost:8780 or, for our AWS instance, http://35.80.3.64:8780.

How to use the app

Main tool

The main tool is designed so that the user can use it very easily. The user is guided step by step and can only proceed to the next step once the previous one has been completed correctly. For each step there is an additional info button that displays important information on hover. When everything has been entered successfully, the calculations can be started. After the calculations have finished without errors, the user is directed to the results page.

Demo

The demo page is structured exactly like the actual tool, but all inputs are pre-filled with default values. The user can view these entries but not change them; they can only start the calculations by clicking the Run demo button. The user should be redirected to the results page in less than 20 seconds.

Output of the results

On a new route, the following three results are visualised on a map:

  • Prediction: land use/land cover classification
  • Area of Applicability (AOA)
  • Suggested further training areas

Each result can be shown or hidden using a checkbox, and its transparency can be adjusted. The underlying satellite images on which the calculations are based are not displayed on the map, but they can be downloaded, like the other results, via a download button. Please note that the Sentinel image of the training areas can only be downloaded if training data has been submitted.

Unfortunately, the downloaded Sentinel images do not contain any band names. However, the bands correspond to the order in which they can be selected. Example: bands B03, B07 and B05 and the additional index BSI have been selected. The order in the TIFF is then: 1 = B03, 2 = B05, 3 = B07, 4 = BSI.
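For the example above, the band names can be restored after download with a one-liner in terra. The file path is a placeholder.

```r
# Reattach band names to a downloaded result TIFF (example from the text).
library(terra)

s2 <- rast("sentinel_aoi.tif")               # downloaded result (placeholder path)
names(s2) <- c("B03", "B05", "B07", "BSI")   # selected bands first, then indices
```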

How to test

To test this app you can proceed as follows:

  • Backend: with your CLI, go into the backend folder and run npm test.
  • Frontend: with your CLI, go into the frontend folder and run ng test.
  • R: the tests are written with the R package testthat.

Requirements:

  • Installation of R
  • Installation of all R packages used in this project
  • Installation of Node.js

Then proceed with the following steps:

  1. Make a clone of the backend repository
  2. Navigate into the backend/test folder
  3. Run node testR.js

Dependencies

The following packages are used in this project:

Frontend

Dev dependencies

Backend

  • axios: Promise based HTTP client for the browser and node.js
  • body-parser: Node.js body parsing middleware
  • chai: BDD/TDD assertion library for node.js and the browser. Test framework agnostic.
  • cors: Node.js CORS middleware
  • dotenv: Loads environment variables from .env file
  • express: Fast, unopinionated, minimalist web framework
  • mocha: simple, flexible, fun test framework
  • multer: Middleware for handling multipart/form-data.
  • ng2-file-upload: Angular component for uploading files to the server
  • nodemon: Simple monitor script for use during development of a node.js app.
  • r-integration: Simple portable library used to interact with pre-installed R compiler by running commands or scripts(files)
  • supertest: SuperAgent driven library for testing HTTP servers
  • swagger-ui-express: Swagger UI Express

R

  • terra: Spatial Data Analysis
  • rgdal: Bindings for the 'Geospatial' Data Abstraction Library
  • rgeos: Interface to Geometry Engine - Open Source ('GEOS')
  • rstac: Client Library for SpatioTemporal Asset Catalog
  • gdalcubes: Earth Observation Data Cubes from Satellite Image Collections
  • raster: Geographic Data Analysis and Modeling
  • caret: Classification and Regression Training
  • CAST: 'caret' Applications for Spatial-Temporal Models
  • lattice: Trellis Graphics for R
  • Orcs: Omnidirectional R Code Snippets
  • jsonlite: A Simple and Robust JSON Parser and Generator for R
  • tmap: Thematic Maps
  • latticeExtra: Extra Graphical Utilities Based on Lattice
  • doParallel: Foreach Parallel Adaptor for the 'parallel' Package
  • parallel
  • sp: Classes and Methods for Spatial Data
  • geojson: Classes for 'GeoJSON'
  • rjson: JSON for R
  • randomForest: Breiman and Cutler's Random Forests for Classification and Regression

Further documentation

The software can be split into two essential parts. The frontend was developed with the web framework Angular. The backend is set up as a Node.js application using the Express framework.

Frontend

Documentation of the frontend, written in Angular with HTML, CSS and TypeScript: Frontend

Backend

The backend can be divided into three parts. The first part is the R scripts that perform the actual operations, e.g. generating the Sentinel images or calculating the AOA. The second part is the API that establishes the connection between the backend and the frontend. The third part is the JavaScript code that sets up the API and connects it to the R part. Please note that the following links can only be used from the internal network of the University of Münster.

License

Copyright (C) 2022 Henning Sander, Frederick Bruch, Jakob Danel, Fabian Schumacher, Thalis Goldschmidt

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.