forked from jamiecmontgomery/spatial-analysis-R
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathintro_to_sf.Rmd
236 lines (157 loc) · 6.67 KB
/
intro_to_sf.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
---
title: "Introduction to the sf Package"
author: "Jamie Afflerbach"
output:
html_document:
code_folding: show
toc: yes
toc_depth: 3
toc_float: yes
pdf_document:
toc: yes
---
# Background
From the [**sf**](https://r-spatial.github.io/sf/articles/sf1.html) vignette:
> Simple features or simple feature access refers to a formal standard (ISO 19125-1:2004) that describes how objects in the real world can be represented in computers, with emphasis on the spatial geometry of these objects. It also describes how such objects can be stored in and retrieved from databases, and which geometrical operations should be defined for them.
> The standard is widely implemented in spatial databases (such as PostGIS), commercial GIS (e.g., ESRI ArcGIS) and forms the vector data basis for libraries such as GDAL. A subset of simple features forms the GeoJSON standard.
> R has well-supported classes for storing spatial data (sp) and interfacing to the above mentioned environments (rgdal, rgeos), but has so far lacked a complete implementation of simple features, making conversions at times convoluted, inefficient or incomplete. The package sf tries to fill this gap, and aims at succeeding sp in the long term.
```{r setup, include=FALSE}
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
#install.packages('sf')
library(sf)
#install.packages('rgdal')
library(rgdal)
#install.packages('dplyr')
library(dplyr)
#install.packages('ggplot2')
library(ggplot2)
```
The **sf** package is an R implementation of [Simple Features](https://en.wikipedia.org/wiki/Simple_Features). This package incorporates:
- a new spatial data class system in R
- functions for reading and writing data
- tools for spatial operations on vectors
Most of the functions in this package starts with prefix `st_` which stands for *spatial* and *temporal*.
# Reading a shapefile
If you've used `readOGR()` from the `rgdal` package, you'll notice the similarities in arguments for the `st_read()` function. We'll do a quick comparison of the two functions here.
Read in a shapefile of the US West Coast using `readOGR()` from the `rgdal` package, and `st_read` from the `sf` package.
* `dsn` is the path name
* `layer` is the name of the file
*NOTE: you do not need to add an extension to the layer name*
```{r read_shp_rdgal}
## Read in shapefile using rgdal
system.time(west_shp_rgdal <- readOGR(dsn="shapefiles", layer="wc_regions"))
object.size(west_shp_rgdal)
plot(west_shp_rgdal)
```
```{r read_shp_sf}
## Read in shapefile using sf
system.time(west_shp_sf <- st_read(dsn = "shapefiles", layer="wc_regions")) # dsn=where file is located
west_shp_sf
object.size(west_shp_sf)
plot(west_shp_sf)
plot(west_shp_sf[1])
# using spatial object as a data frame
nrow(west_shp_sf)
ncol(west_shp_sf)
```
You'll notice right away that these two objects are being plotted differently. This is because these two objects are of different types.
**sf** objects usually have two classes - `sf` and `data.frame`. Two main differences comparing to a regular `data.frame` object are spatial metadata (`geometry type`, `dimension`, `bbox`, `epsg (SRID)`, `proj4string`) and additional column - typically named `geom` or `geometry`.
```{r}
class(west_shp_sf)
```
# Attributes
**sf** objects can be used as a regular `data.frame` object in many operations
```{r}
west_shp_sf
nrow(west_shp_sf)
ncol(west_shp_sf)
```
## sf & dplyr
It also easy to use the **dplyr** package on `sf` objects:
`select()`
```{r select}
library(dplyr)
west_shp_sf %>%
select(rgn, rgn_key, geometry)
# west_shp_sf %>%
# dplyr::select(rgn, rgn_key, geometry)
```
`filter()`
```{r filter}
or <- west_shp_sf %>%
filter(rgn == "Oregon")
plot(or[1])
```
`mutate()`
```{r mutate}
sf_out <- west_shp_sf %>%
mutate(rgn_id = c(1:5),
area_km2 = area_m2/1000000) #converting square meters to square kilometers
sf_out
plot(sf_out[5])
```
## Reprojection
- The `st_transform()` can be used to transform coordinates
```{r}
sf <- st_transform(sf_out, crs = '+proj=moll +ellps=WGS84') #reprojecting to a mollweide CRS; st=spatial/temporal
sf
plot(sf[1])
```
# Save
Save the spatial object to disk using `st_write()` and specifying the filename as well as the [driver](http://www.gdal.org/ogr_formats.html).
```{r plot}
st_write(sf_out, "shapefiles/wc_regions_clean.shp", driver = "ESRI Shapefile", delete_layer = TRUE)
```
# Visualize with ggplot
`ggplot2` now has integrated functionality to plot sf objects using `geom_sf()`.
```{r}
#simplest plot
library(devtools)
#devtools::install_github('tidyverse/ggplot2')
library(ggplot2)
ggplot(sf_out) +
geom_sf()
```
This is useful to make sure your file looks correct but doesn't display any information about the data. We can plot these regions and fill each polygon based on the rgn_id.
```{r}
ggplot(sf_out) +
geom_sf(aes(fill = rgn))
```
We can clean it up a bit, applying a cleaner theme and assigning a Spectral color palette.
```{r}
ggplot(sf_out) +
geom_sf(aes(fill = rgn)) +
theme_bw() +
labs(fill = "Region") +
scale_fill_brewer(palette = "Spectral")
```
This might not be as useful, but you can imagine a use case for continuous values, whether with a spatial object with more polygons or different data types.
```{r}
ggplot(sf_out) +
geom_sf(aes(fill = area_km2)) +
theme_bw() +
labs(fill = "Area (km2)") +
scale_fill_distiller(palette = "Blues", direction=1) #direction can be used to reverse the color palette scale
```
# Union
You can merge all polygons into one using `st_union()`. This creates an af
```{r st_union}
full_rgn <- st_union(sf_out)
plot(full_rgn)
#try plotting with ggplot2
# ggplot(full_rgn) +
# geom_sf()
```
This doesn't work because now the object is a single geometry object of class `sfc` which is not a data.frame that ggplot requires for plotting. To turn an `sfc` object back into an `sf` object use `st_sf()`.
```{r st_sf}
full_rgn <- st_union(sf_out) %>%
st_sf(geometry = .)
ggplot(full_rgn) +
geom_sf()
```
There is a lot more functionality to `sf` including the ability to `intersect` polygons, calculate `distance`, create a `buffer`, and more. Here are some more great resources and tutorials for a deeper dive into this great package:
[Spatial analysis in R with the sf package](https://cdn.rawgit.com/rhodyrstats/geospatial_with_sf/bc2b17cf/geospatial_with_sf.html)
[Intro to Spatial Analysis](https://cdn.rawgit.com/Nowosad/Intro_to_spatial_analysis/05676e29/Intro_to_spatial_analysis.html#1)
[sf github repo](https://github.com/r-spatial/sf)
[Tidy spatial data in R: using dplyr, tidyr, and ggplot2 with sf](http://strimas.com/r/tidy-sf/)
[mapping-fall-foliage-with-sf](https://rud.is/b/2017/09/18/mapping-fall-foliage-with-sf/)