# Calculate Spatial Metrics
While we've generated some nice visualizations, we also need to quantify insights as metrics at the neighborhood level. Don't start this step until you have a clear idea of what you need; visualizing and exploring the data in depth first is best practice.
For our purposes, we're interested in developing spatial access metrics with a container method approach. At the end of this tutorial, we'll generate the following new variables:
- Total number of Methadone Maintenance MOUD by zip code
- Total number of Walkable MOUD Service Areas by zip code
Plus, we will have a new spatial layer that includes the actual service areas (i.e., 1-mile buffers around MOUD providers). We assume that access to MOUDs is critical and requires high regularity, and that walking is the most likely mode of travel during COVID. This guides the parameter specification of the MOUD Service Areas (and is also backed up by some literature in this space, though much more is needed).
## Load Spatial Data
Let's first reload our spatial data -- this will be the MOUD points, plus the master zip code spatial file.
```{r warning = FALSE}
library(sf)
library(tmap)

# MOUD provider points
points <- st_read("data/methadoneMOUD.geojson")

# Master zip code boundaries
areas <- st_read("data/geo_export_54bc15d8-5ef5-40e4-8f72-bb0c6dbac9a5.shp")

# Inspect dimensions and first rows of each layer
dim(points)
dim(areas)
head(points)
head(areas)
```
## Transform Projections
First we need to switch to a projection that uses distance in feet or meters as its unit. We'll use EPSG:3435 from the first tutorial. To find a recommended EPSG, I searched "EPSG Illinois feet" and EPSG:3435 came up as a viable candidate, so we use that as our new, projected CRS.
```{r}
# Reproject both layers to EPSG:3435 (NAD83 / Illinois East, US feet)
areas <- st_transform(areas, 3435)
points <- st_transform(points, 3435)
```
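To confirm the transformation took effect, you can check the CRS directly. A quick, optional sanity check:
```{r}
# Both layers should now report EPSG 3435
st_crs(areas)$epsg
st_crs(points)$epsg
```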
We may want to once again confirm they are plotting correctly:
```{r}
tm_shape(areas) + tm_polygons() +
tm_shape(points) + tm_dots(size = 1)
```
## Count resources by area
One way of understanding resource inequity is by thinking about how many resources exist in a neighborhood.
First, give the points the attributes of the polygons they fall in, then inspect the result.
```{r}
# Spatial join: attach to each point the attributes of the zip code it falls in
pipr <- st_join(points, areas)
head(pipr)
```
You should have the same number of rows in `pipr` as in `points`; if not, something is off and you may need to go back and troubleshoot. In an earlier version of this lab, I used a saved, written `geojson` file of the zip codes, and it would not render properly. Here, we load the original shapefile at the beginning of the tutorial to avoid that error.
```{r}
dim(pipr)
dim(points)
dim(areas)
```
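If you'd rather have this check fail loudly than rely on eyeballing the output, a one-line assertion (an optional sketch) works:
```{r}
# Stop with an error if the join changed the number of point rows
stopifnot(nrow(pipr) == nrow(points))
```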
### Count # per Area
Next, count the number per area. The frequencies should be consistent with the map you made earlier. Sometimes I've found bugs where the numbers are multiplied by some factor; this can be checked by looking for dimension disparities, as noted above.
```{r}
# Tabulate the number of joined points per zip code
ptcount <- as.data.frame(table(pipr$Zip))
head(ptcount)
```
How could you improve on this step using `dplyr`?
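One possibility, sketched here assuming `dplyr` is loaded, is to count and rename in a single pipeline:
```{r}
library(dplyr)

# Drop geometry, count points per zip, and rename columns in one pipeline
ptcount_tidy <- pipr %>%
  st_drop_geometry() %>%
  count(Zip, name = "MetClnc") %>%
  rename(zip = Zip)

head(ptcount_tidy)
```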
**Aggregation Tip:** What if you have an attribute value you'd like to aggregate? For example, the average number of units per affordable housing facility by zip?
Try `aggregate(pipr$attribute, by = list(pipr$geoid), mean)` (with `attribute` and `geoid` as placeholders for your own columns), then build on it with a tidy sensibility, as sketched below.
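As a hedged sketch, assuming a hypothetical numeric attribute called `units` in the joined data (substitute column names that actually exist in your own data), the tidy version might look like:
```{r eval = FALSE}
library(dplyr)

# Hypothetical: average a numeric attribute (`units`) per zip code
# Replace `units` with a column that exists in your joined data
avgunits <- pipr %>%
  st_drop_geometry() %>%
  group_by(Zip) %>%
  summarize(avg_units = mean(units, na.rm = TRUE))
```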
Now we can rename our attributes:
```{r}
names(ptcount) <- c("zip", "MetClnc")
head(ptcount)
```
And finally, merge back to your master zip file:
```{r}
head(areas)

# Keep all zip codes (all = TRUE), even those with no clinics
areas <- merge(areas, ptcount, by = "zip", all = TRUE)
head(areas)
```
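One caveat: because we merged with `all = TRUE`, zip codes containing no clinics come through with `NA` rather than `0` in `MetClnc`. If you'd rather treat those as explicit zero counts (often cleaner for mapping), a quick recode does it:
```{r}
# Zip codes with no clinics get NA from the merge; recode them as zero counts
areas$MetClnc[is.na(areas$MetClnc)] <- 0
```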
Quickly map to inspect:
```{r}
tm_shape(areas) + tm_polygons(col = "gray80") +
tm_shape(areas) + tm_polygons(col = "MetClnc", style = "pretty", alpha = 0.8) +
tm_shape(points) + tm_dots(size = 0.5)
```
## Buffer Data
Next, let's create a walkable buffer of one mile, or 5,280 feet, around each MOUD provider location. Individuals residing outside of that walkable area may have difficulty accessing this medication during crises like a pandemic.
```{r}
# Create 1-mile (5280 ft) buffers around each point; EPSG:3435 units are feet
ptbuffers <- st_buffer(points, 5280)
head(ptbuffers)
```
Inspect immediately:
```{r}
tm_shape(areas) + tm_borders() +
tm_shape(ptbuffers) + tm_borders(col = "blue")
```
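Recall that one of our promised outputs is a new spatial layer of service areas. If you want to keep the buffers for later use, you can write them out now; the filename here is just a placeholder:
```{r eval = FALSE}
# Save the 1-mile service areas as a new spatial layer
# (the filename is a placeholder -- use whatever fits your project)
st_write(ptbuffers, "data/moudServiceAreas.geojson")
```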
## Count buffers by area
We assume that MOUD locations are accessible up to one mile away, so a total count of resources physically inside each area may be too restrictive: a clinic just across a zip code boundary can still be walkable. Let's instead calculate how many *walkable* MOUD clinics serve each zip code. In other words, how many buffers intersect each area...
```{r}
# For each zip code, count how many service-area buffers intersect it
bufferct <- lengths(st_intersects(areas, ptbuffers))
head(bufferct)
```
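Before merging, it's worth glancing at the distribution of counts; it should look plausible given your earlier maps:
```{r}
# Frequency and spread of walkable-clinic counts across zip codes
table(bufferct)
summary(bufferct)
```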
Stick buffer totals back to the zip master file:
```{r}
# Attach buffer counts back to the zip code master file
areas <- cbind(areas, bufferct)
head(areas)
```
Map the density of buffers per zip code:
```{r}
tm_shape(areas) + tm_polygons(col = "bufferct", palette = "BuGn", n=5, style = "jenks") +
tm_shape(ptbuffers) + tm_fill(col = "gray90", alpha=0.6) +
tm_shape(points) + tm_dots(col = "gray10", size = 0.3)
```
## Integrate & Explore
Let's review: our master area file now has the total number of resources by zip and the total number of walkable service areas by zip.
Using your new spatial file, see if you can answer some of these questions using various queries:
- Which zip codes have high rates of COVID and are not within walking distance of a methadone MOUD provider?
- Which zip codes have worse access to affordable rental units, lower educational attainment, and less walkable access to MOUDs?
- What are the demographic and racial/ethnic characteristics of the areas most vulnerable to high COVID rates in September 2020?
Generate different maps and outputs to drive your thinking and support your hypotheses. One starting point follows below.
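As one minimal sketch using only the variables created in this tutorial (the COVID and housing questions will need columns merged in from earlier tutorials, with names that vary by your own data):
```{r}
# Zip codes with no walkable MOUD service area at all
noaccess <- areas[areas$bufferct == 0, ]
head(noaccess)

# Quick visual check of where those zip codes are
tm_shape(areas) + tm_borders() +
  tm_shape(noaccess) + tm_polygons(col = "red", alpha = 0.5)
```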