Pedro R. Andrade edited this page May 29, 2017 · 33 revisions

Neighborhoods from Geospatial Data

Pedro R. Andrade

Introduction

Neighborhoods describe how spatial entities are connected. They summarize (possibly complex) geospatial relations into a simple representation that makes the simulation faster. This tutorial describes how to create Generalized Proximity Matrices (GPM) within TerraME. GPM is based on the idea that Euclidean spaces are not enough to describe the relations that take place within geographical space. It is composed of a set of strategies that try to capture such spatial warp by computing operations over sets of spatial data.

In the next sections, we describe the basic structure of the implementation and present some examples of creating proximity matrices. For more information about GPM, see Aguiar et al. (2003), Modeling Spatial Relations by Generalized Proximity Matrices, Proceedings of V Brazilian Symposium in Geoinformatics (GeoInfo'03).

Before starting, we need some spatial data available within the gpm package (documentation available here):

  • a set of lines, representing roads;
  • a set of points, representing the centers of communities;
  • a set of polygons, representing farms;
  • a set of cells, created from the farms using the script cells.lua, available in the data directory of the gpm package.

Note that all of them are declared using geometry = true to load their geometries.

roads       = CellularSpace{file = filePath("roads.shp", "gpm"),       geometry = true}
communities = CellularSpace{file = filePath("communities.shp", "gpm"), geometry = true}
farms       = CellularSpace{file = filePath("farms.shp", "gpm"),       geometry = true}
cells       = CellularSpace{file = filePath("cells.shp", "gpm"),       geometry = true}

The figure below shows them.

Basic Strategies

This section presents the strategies that use Euclidean spaces to create GPM. All the code below uses the gpm package, which can be loaded using:

import("gpm")

The available strategies depend on the spatial representation of the data. They are summarized in the table below.

From     | To Points                   | To Lines          | To Polygons
Points   | Distance, Network           | Distance, Network | Distance, Network
Lines    | Distance, Network           | Distance, Network | Distance, Length, Network
Polygons | Contains, Distance, Network | Distance, Network | Area, Border, Distance, Network
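
The table above can also be read as a lookup structure. The sketch below, in plain Lua and independent of TerraME, copies the table into nested Lua tables and checks whether a strategy is listed for a given pair of geometries. The names strategies and isAvailable are hypothetical helpers for illustration, not part of the gpm package, and the strategy names are written in lower case as in the code of this tutorial.

```lua
-- Strategies per (origin, destination) geometry pair, copied from the table above.
strategies = {
    points = {
        points   = {"distance", "network"},
        lines    = {"distance", "network"},
        polygons = {"distance", "network"}
    },
    lines = {
        points   = {"distance", "network"},
        lines    = {"distance", "network"},
        polygons = {"distance", "length", "network"}
    },
    polygons = {
        points   = {"contains", "distance", "network"},
        lines    = {"distance", "network"},
        polygons = {"area", "border", "distance", "network"}
    }
}

-- Hypothetical helper: true if `strategy` is listed for the given pair.
function isAvailable(from, to, strategy)
    local row = strategies[from]
    local list = row and row[to]
    if not list then return false end

    for _, name in ipairs(list) do
        if name == strategy then return true end
    end

    return false
end

print(isAvailable("polygons", "polygons", "area")) -- true
print(isAvailable("points", "lines", "length"))    -- false
```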

Area

The first example of GPM starts with a strategy that uses the intersection area to create relations between cells and polygons. Two spatial objects are connected if they have some intersection area. We can declare a GPM using cells as origin, farms as destination, and area as strategy to create relations based on the intersection area.

gpm = GPM{
    origin = cells,
    strategy = "area",
    destination = farms
}

This GPM can be saved to a file by using save(), presented in more detail in the last section. The code below saves it as a .gpm file. The neighborhood can then be loaded into a simulation using CellularSpace:loadNeighborhood().

gpm:save("cell-neighborhood.gpm")

GPM can also be used to create new attributes for the CellularSpace using fill(). For example, if we want to count how many polygons cover each cell, we can use strategy = "count". In this case, we will also use an optional argument max = 5 to indicate that, if a given cell has more than five neighbors, it will use five as the output value.

gpm:fill{
    strategy = "count",
    attribute = "quantity",
    max = 5
}
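
To make the effect of max concrete, the following plain Lua sketch reproduces the counting rule described above outside TerraME. The table neighbors and the function countNeighbors are hypothetical names used only for illustration. They are not how gpm stores or computes its relations.

```lua
-- Hypothetical neighbor lists: cell id -> ids of the polygons that cover it.
neighbors = {
    ["0"] = {"a", "b"},
    ["1"] = {},
    ["2"] = {"a", "b", "c", "d", "e", "f", "g"}
}

-- Count the neighbors of one cell, limiting the result to `max` when given.
function countNeighbors(list, max)
    local quantity = #list

    if max and quantity > max then
        return max
    end

    return quantity
end

for _, id in ipairs({"0", "1", "2"}) do
    print(id, countNeighbors(neighbors[id], 5))
end
```

In this sketch, the third cell has seven neighbors and therefore gets the value five, while the other two keep their real counts.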

We can then create a Map to visualize the output.

map = Map{
    target = gpm.origin,
    select = "quantity",
    min = 0,
    max = 5,
    slices = 6,
    color = "Reds"
}

gpm:fill{
    strategy = "maximum",
    attribute = "max",
    copy = {farm = "id"}
}
-- to paint them with different colors, we use the remainder of the division by 9
forEachCell(gpm.origin, function(cell)
    cell.farm = tonumber(cell.farm) % 9
end)

map = Map{
    target = gpm.origin,
    select = "farm",
    min = 0,
    max = 8,
    slices = 9,
    color = "Set1"
}

Distance

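The second strategy uses the distance between centroids to create relations. To accomplish that, we use strategy "distance". In the second part of this section, the argument distance = 4000 restricts the relations to objects that are closer than 4000m.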

gpm = GPM{
    origin = cells,
    destination = communities,
    strategy = "distance"
}

gpm:fill{
    strategy = "minimum",
    attribute = "dist",
    copy = "LOCALIDADE"
}
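
As an illustration of what strategy "minimum" with copy produces, the plain Lua sketch below finds, for a single cell, the smallest Euclidean distance to a set of points and the name of the closest one. The coordinates are hypothetical and the functions distance and closest are illustrative helpers, not part of the gpm package. We assume here that distances are measured between centroids, as stated above.

```lua
-- Hypothetical centroids (in meters); the names come from the maps in this section.
communities = {
    {name = "Santa Rosa", x = 0,    y = 0},
    {name = "Garrafao",   x = 6000, y = 8000}
}

-- Euclidean distance between two points.
function distance(a, b)
    local dx, dy = a.x - b.x, a.y - b.y
    return math.sqrt(dx * dx + dy * dy)
end

-- Return the smallest distance from `cell` to any place and the name of that place.
function closest(cell, places)
    local best, name = math.huge, nil

    for _, place in ipairs(places) do
        local d = distance(cell, place)
        if d < best then
            best, name = d, place.name
        end
    end

    return best, name
end

cell = {x = 300, y = 400}
cell.dist, cell.LOCALIDADE = closest(cell, communities)
print(cell.dist, cell.LOCALIDADE)
```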

map1 = Map{
    target = cells,
    select = "dist",
    slices = 8,
    min = 0,
    max = 7000,
    color = "YlOrRd",
    invert = true
}

map2 = Map{
    target = cells,
    select = "LOCALIDADE",
    value = {"Palhauzinho", "Santa Rosa", "Garrafao", "Mojui dos Campos"},
    color = "Set1"
}

gpm:fill{
    strategy = "all",
    attribute = "d"
}

for i = 0, 3 do
    Map{
        target = cells,
        select = "d_"..i,
        slices = 8,
        min = 0,
        max = 10000,
        color = "YlOrRd",
        invert = true
    }
end

gpm = GPM{
    origin = cells,
    destination = communities,
    distance = 4000
}

gpm:fill{
    strategy = "count",
    attribute = "quantity"
}

gpm:fill{
    strategy = "minimum",
    attribute = "dist",
    dummy = 7000,
    copy = "LOCALIDADE"
}

-- as there is a limit of 4000m, cells farther than that from
-- every community will not have attribute LOCALIDADE
forEachCell(cells, function(cell)
    if not cell.LOCALIDADE then
        cell.LOCALIDADE = "<none>"
    end
end)
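
The interplay between the distance limit and dummy can be sketched in plain Lua as follows. This is only an illustration under two assumptions: that a relation exists when the distance is strictly smaller than the limit, and that dummy is the value stored when a cell has no neighbor at all. The table distances and the function summarize are hypothetical.

```lua
-- Hypothetical distances (in meters) from three cells to every community.
distances = {
    {1200, 3900, 8000},
    {4500, 6100, 9000},
    {4000, 2500, 100}
}

-- Summarize one cell: count the neighbors closer than `limit` and return the
-- smallest of their distances, or `dummy` when no neighbor is within the limit.
function summarize(list, limit, dummy)
    local quantity, minimum = 0, nil

    for _, d in ipairs(list) do
        if d < limit then
            quantity = quantity + 1
            if not minimum or d < minimum then
                minimum = d
            end
        end
    end

    return quantity, minimum or dummy
end

for i, list in ipairs(distances) do
    print(i, summarize(list, 4000, 7000))
end
```

The second cell has no community within the limit, so it gets zero neighbors and the dummy value as its distance.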

map1 = Map{
    target = cells,
    select = "quantity",
    min = 0,
    max = 5,
    slices = 6,
    color = "RdPu"
}

map2 = Map{
    target = cells,
    select = "dist",
    slices = 8,
    min = 0,
    max = 7000,
    color = "YlOrRd",
    invert = true
}

map3 = Map{
    target = cells,
    select = "LOCALIDADE",
    value = {"Palhauzinho", "Santa Rosa", "Garrafao", "Mojui dos Campos", "<none>"},
    color = "Set1"
}

Length

The strategy presented in this section computes neighborhoods based on the intersection between lines and cells. Each cell is connected to the line segments that intersect it by using strategy "length". The GPM takes a layer of cells as origin and a layer of lines as destination. The code below creates a neighborhood between the cells and the roads.

gpm = GPM{
    origin = cells,
    strategy = "length",
    destination = roads
}

gpm:fill{
    strategy = "count",
    attribute = "quantity",
    max = 1
}

map = Map{
    target = cells,
    select = "quantity",
    value = {0, 1},
    label = {"0", "1 or more"},
    color = {"gray", "blue"}
}

Contains

In this section, we present a strategy that computes neighborhoods between a layer of polygons and a layer of points based on the spatial relation "contains". A cell is connected to the points located inside its area. The code below creates a neighborhood between the cells and the communities.

gpm = GPM{
    origin = cells,
    strategy = "contains",
    destination = communities
}

gpm:fill{
    strategy = "count",
    attribute = "quantity"
}

map = Map{
    target = cells,
    select = "quantity",
    value = {0, 1},
    color = {"lightGray", "blue"}
}

Connecting lines with polygons

In this section, we present a strategy to compute neighborhoods between a layer of lines and a layer of polygons, in which each line has as neighbors the polygons intersected by it. The code below creates a neighborhood between the layer "rodovias" and the layer "lotes".

Border

local states = CellularSpace{
    file = filePath("partofbrazil.shp", "gpm"),
    geometry = true
}
local gpm = GPM{
    origin = states,
    strategy = "border",
    progress = false
}

This example uses the neighborhood relations directly.

forEachOrderedElement(gpm.neighbor, function(idx, neigh)
    print(states:get(idx).name)

    forEachOrderedElement(neigh, function(midx, weight)
        print("\t"..states:get(midx).name.." ("..string.format("%.2f", weight)..")")
    end)
end)

This script will produce the following output:

MINAS GERAIS
	RIO DE JANEIRO (0.10)
	SAO PAULO (0.25)
	ESPIRITO SANTO (0.12)
PARANA
	SAO PAULO (0.32)
RIO DE JANEIRO
	MINAS GERAIS (0.29)
	SAO PAULO (0.14)
	ESPIRITO SANTO (0.09)
SAO PAULO
	MINAS GERAIS (0.38)
	PARANA (0.26)
	RIO DE JANEIRO (0.07)
ESPIRITO SANTO
	MINAS GERAIS (0.48)
	RIO DE JANEIRO (0.11)

Network

The last strategy presented in this vignette computes neighborhoods based on the distance through a given network represented by a set of lines. The original data has to be very well represented, with the starting and ending points of two lines being connected to one another when they share the same position in space. In this type of network, it is possible to enter and leave the roads at any position. The type Network is used to generate the network. It takes as arguments the destination (reference) points, the lines that represent the network, and a function that computes the distance on the network given the length of each line and the line itself. The code below creates a network that reduces the distance to one fifth of the Euclidean distance on paved roads and to half on the others. The attribute STATUS of the table connected to the layer of lines indicates whether the road is paved or not. The function outside multiplies by four the distance covered outside the network.

network = Network{
    target = communities,
    lines = roads,
    weight = function(distance, cell)
        if cell.STATUS == "paved" then
            return distance / 5
        else
            return distance / 2
        end
    end,
    outside = function(distance) return distance * 4 end
}
gpm = GPM{
    network = network,
    origin = cells
}
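
To see how these functions change distances, the plain Lua sketch below applies the same weight and outside rules to a hypothetical route. It assumes that the cost of a route is the sum of the weighted distance to enter the network, the weighted length of each line travelled, and the weighted distance to leave it. The function routeCost, the lengths, and the value "unpaved" are illustrative only and not part of the gpm package.

```lua
-- Same rules as the Network above: paved lines cost one fifth of their length,
-- the other lines cost half of it.
function weight(distance, line)
    if line.STATUS == "paved" then
        return distance / 5
    else
        return distance / 2
    end
end

-- Distance covered outside the network costs four times its length.
function outside(distance)
    return distance * 4
end

-- Hypothetical helper: cost of entering the network, following some lines,
-- and leaving the network again.
function routeCost(entry, lines, exit)
    local total = outside(entry) + outside(exit)

    for _, line in ipairs(lines) do
        total = total + weight(line.length, line)
    end

    return total
end

-- 300 m to reach a road, 2000 m on a paved line, 1000 m on another line,
-- and 100 m to leave the network.
cost = routeCost(300, {
    {length = 2000, STATUS = "paved"},
    {length = 1000, STATUS = "unpaved"}
}, 100)

print(cost)
```

Under these assumptions the 3400 m route costs 2500 units, and most of the cost comes from the 400 m covered outside the roads.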

gpm:fill{
    strategy = "minimum",
    attribute = "dist",
    copy = "LOCALIDADE"
}

map1 = Map{
    target = cells,
    select = "dist",
    slices = 10,
    min = 0,
    max = 14000,
    color = "YlOrBr"
}

map2 = Map{
    target = cells,
    select = "LOCALIDADE",
    value = {"Palhauzinho", "Santa Rosa", "Garrafao", "Mojui dos Campos"},
    color = "Set1"
}

The figure below shows the polygons drawn with the color of the closest point through the network. There is a known limitation in the current version of GPM: it does not work properly when the entry point on the network for a given point is the start or end of a line segment.

Neighborhood files

Once we have created the GPM through one of the strategies presented above, we can save it in a file, which can be a GAL file (".gal" or ".GAL"), a GWT file (".gwt" or ".GWT"), or a GPM file (".gpm"), through the function save(). The only argument of this function is the name of the file to be saved. The saved file can then be loaded using CellularSpace:loadNeighborhood().

gal

The structure of a GAL file does not store information about the attributes of the GPM, but only whether two objects are neighbors. Furthermore, it does not support neighborhoods between objects of different layers. As in the GPM file, the first line of the file is the header, and the relations start in the second line. In the header, we have the following fields:

  1. The character "0", indicating that it is a neighborhood file;
  2. Number of objects of the data;
  3. Name of the layer for which the GPM was created;
  4. Name of the object attribute used as identifier of the objects. The default value is object_id_.

From the second line until the end of file, the relations are represented. The neighborhood of each object is represented in two lines. The first contains:

  1. Unique identifier of the N-th object;
  2. Number of neighbors of the N-th object.

The second line contains the unique identifiers of the neighbors of the N-th object (ID_Neighbor_1, ID_Neighbor_2, and so on).

0 Num_elements Layer Key_Variable
ID_Object_1 Num_Neighbors
ID_Neighbor_1 ID_Neighbor_2 ... ID_Neighbor_N
...

A small example of a GAL file is shown below.

0 111 farms_cells.shp object_id_
0 4
0 1 2 3
1 4
0 1 2 3
10 4
0 1 2 3
100 4
0 1 2 3
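
The structure above is simple enough to be read with a few lines of plain Lua. The sketch below is not the reader used by TerraME. It is a minimal illustration with hypothetical functions split and parseGAL, which keep all identifiers as strings and assume that an object with zero neighbors has an empty or missing second line.

```lua
-- Split a line into its whitespace-separated fields.
function split(line)
    local fields = {}
    for field in line:gmatch("%S+") do
        fields[#fields + 1] = field
    end
    return fields
end

-- Parse the text of a GAL file into a header and a table that maps each
-- object id to the list of its neighbor ids (all ids are kept as strings).
function parseGAL(text)
    local lines = {}
    for line in text:gmatch("[^\r\n]+") do
        if line:match("%S") then -- ignore blank lines
            lines[#lines + 1] = line
        end
    end

    local h = split(lines[1])
    local header = {elements = tonumber(h[2]), layer = h[3], key = h[4]}

    local neighbors = {}
    local i = 2
    while i <= #lines do
        local first = split(lines[i])
        local id, count = first[1], tonumber(first[2])

        if count > 0 then
            local list = split(lines[i + 1] or "")
            assert(#list == count, "wrong number of neighbors for object " .. id)
            neighbors[id] = list
            i = i + 2
        else
            neighbors[id] = {}
            i = i + 1
        end
    end

    return header, neighbors
end

sample = [[
0 111 farms_cells.shp object_id_
0 4
0 1 2 3
1 4
0 1 2 3
]]

header, neighbors = parseGAL(sample)
print(header.layer, header.elements)
print(table.concat(neighbors["1"], ", ")) -- 0, 1, 2, 3
```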

gwt

The GWT format also does not support neighborhoods between objects of different layers. The header of the GWT format is the same as that of GAL. From the second line until the end of the file, it stores one connection per line using the following fields:

  1. Unique identifier of the N-th object;
  2. M-th neighbor of the N-th object;
  3. Weight (attribute value) of the relation between the N-th object and the M-th neighbor.

The structure is presented below.

0 Num_elements Layer Key_Variable
ID_Object_1 ID_Neighbor_1 Weight_Neighbor_1
ID_Object_1 ID_Neighbor_2 Weight_Neighbor_2
...
ID_Object_1 ID_Neighbor_N Weight_Neighbor_N
ID_Object_2 ID_Neighbor_1 Weight_Neighbor_1
...

An example of a GWT file is shown below. The first line starts with zero, followed by the number of objects (111), the file name, and the attribute name. The second line indicates that object zero is connected to itself with weight 5501.96.

0 111 farms_cells.shp object_id_
0 0 5501.9562754449
0 1 6153.7686145953
0 2 10641.235146975
0 3 13929.451501441
1 0 7365.1627505185
1 1 8020.7951384333
1 2 9759.3030416018
1 3 12486.174413104
10 0 5012.8344428588
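
Since each line after the header holds exactly one connection, a GWT file can be read line by line. The plain Lua sketch below illustrates this with a hypothetical function parseGWT. It is not the reader used by TerraME, and it keeps identifiers as strings and converts only the weights to numbers.

```lua
-- Parse the text of a GWT file into a header and a nested table such that
-- weights[id][neighbor] is the weight of the connection from id to neighbor.
function parseGWT(text)
    local header, weights = nil, {}

    for line in text:gmatch("[^\r\n]+") do
        local fields = {}
        for field in line:gmatch("%S+") do
            fields[#fields + 1] = field
        end

        if #fields > 0 then -- ignore blank lines
            if not header then
                header = {elements = tonumber(fields[2]), layer = fields[3], key = fields[4]}
            else
                local id, neighbor = fields[1], fields[2]
                weights[id] = weights[id] or {}
                weights[id][neighbor] = tonumber(fields[3])
            end
        end
    end

    return header, weights
end

sample = [[
0 111 farms_cells.shp object_id_
0 0 5501.9562754449
0 1 6153.7686145953
1 0 7365.1627505185
]]

header, weights = parseGWT(sample)
print(header.layer, weights["0"]["1"])
```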

More information about the GAL and GWT formats can be found in the GeoDa User's Guide or the SpaceStat documentation.

gpm

The structure of the GPM file contains a header in the first line, with the following fields:

  1. Number of attributes of the relations. In the GPM, each relation can have several attributes.
  2. Name of the first layer for which the GPM was created. If the data was loaded directly from a shapefile, it will be the name of the file.
  3. Name of the second layer. Connections are established from the first layer to the second one. If the GPM was created from a single layer, then the name of the first layer will be repeated in this field.
  4. Names of the GPM attributes.

The lines from the second until the end of the file describe the connections. The neighborhood of each object uses two lines. The first one contains:

  1. Unique identifier of the N-th object;
  2. Number of neighbors of the N-th object.

The second line contains the neighborhood of the object whose ID is in the previous line. Each neighbor is represented by the fields below.

  1. M-th neighbor of the N-th object;
  2. Value of the k-th attribute of the M-th neighbor.

The structure of a GPM file is summarized below.

Num_attributes Layer_1 Layer_2 Attribute_1 Attribute_2 ... Attribute_N
ID_Object_1 Num_Neighbors
ID_Neighbor_1 Attrib_1_Neigh_1 Attrib_2_Neigh_1 ... Attrib_N_Neigh_1 ID_Neighbor_2 ...
ID_Object_2 ...
...

An example of a GPM file with one attribute is shown below.

111 farms_cells.shp communities.shp object_id_
0 4
0 5501.9562754449 1 6153.7686145953 2 10641.235146975 3 13929.451501441
1 4
0 7365.1627505185 1 8020.7951384333 2 9759.3030416018 3 12486.174413104
10 4
0 5012.8344428588 1 3069.7189628308 2 9426.2638737062 3 15346.631146027
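
The plain Lua sketch below shows one way to read this structure. It is not the reader used by TerraME, and the functions split and parseGPM are hypothetical. As a design choice, it counts the attribute names listed in the header instead of relying on the first field, and it keeps identifiers as strings. The sample file and its attribute name distance are made up for illustration.

```lua
-- Split a line into its whitespace-separated fields.
function split(line)
    local fields = {}
    for field in line:gmatch("%S+") do
        fields[#fields + 1] = field
    end
    return fields
end

-- Parse the text of a GPM file. Returns the header and a table such that
-- neighbors[id][neighbor][attribute] is the value of that attribute.
function parseGPM(text)
    local lines = {}
    for line in text:gmatch("[^\r\n]+") do
        if line:match("%S") then -- ignore blank lines
            lines[#lines + 1] = line
        end
    end

    local h = split(lines[1])
    local header = {origin = h[2], destination = h[3], attributes = {}}
    for i = 4, #h do
        header.attributes[#header.attributes + 1] = h[i]
    end

    -- each neighbor uses one field for its id plus one per attribute
    local width = #header.attributes + 1

    local neighbors = {}
    local i = 2
    while i <= #lines do
        local first = split(lines[i])
        local id, count = first[1], tonumber(first[2])
        neighbors[id] = {}

        if count > 0 then
            local fields = split(lines[i + 1] or "")
            assert(#fields == count * width, "malformed neighbor line for object " .. id)

            for n = 0, count - 1 do
                local values = {}
                for k, name in ipairs(header.attributes) do
                    values[name] = tonumber(fields[n * width + 1 + k])
                end
                neighbors[id][fields[n * width + 1]] = values
            end
            i = i + 2
        else
            i = i + 1
        end
    end

    return header, neighbors
end

-- Hypothetical file with a single attribute called "distance".
sample = [[
1 cells.shp communities.shp distance
0 2
0 5501.95 1 6153.76
1 1
3 12486.17
]]

header, neighbors = parseGPM(sample)
print(header.origin, header.destination, neighbors["0"]["1"].distance)
```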