# Calculate Spatial Metrics
While we've generated some nice visualizations, we also need to quantify insights as metrics at the neighborhood level. Don't start this step until you have a clear idea of what you need; visualizing and exploring the data in depth first is best practice.
For our purposes, we're interested in developing spatial access metrics with a container method approach. At the end of this tutorial, we'll generate the following new variables:
- Total number of Methadone Maintenance MOUD by zip code
- Total number of Walkable MOUD Service Areas by zip code
Plus, we will have a new spatial layer that includes the actual service areas (i.e., 1-mile buffers around MOUD providers). We assume that access to MOUDs is critical and requires high regularity, and that walking is the most likely mode of travel during COVID. This guides the parameter specification of the MOUD Service Areas (and is also backed up by some literature in this space, though much more is needed).
## Load Spatial Data
Let's first reload our spatial data -- this will be the MOUD points, plus the master zip code spatial file.
```{r warning = FALSE}
library(sf)
library(tmap)

# MOUD provider points
points <- st_read("data/methadoneMOUD.geojson")

# Master zip code boundaries
areas <- st_read("data/geo_export_54bc15d8-5ef5-40e4-8f72-bb0c6dbac9a5.shp")

# Inspect dimensions and first rows of each layer
dim(points)
dim(areas)
head(points)
head(areas)
```
## Transform Projections
First we need to switch to a projection that uses distance in feet or meters as its unit. We'll use EPSG:3435 from the first tutorial. To find a recommended EPSG, I searched "EPSG Illinois feet" and EPSG:3435 came up as a viable candidate, so we use that as our new, projected CRS.
```{r}
# Reproject both layers to EPSG:3435 (NAD83 / Illinois East, US feet)
areas <- st_transform(areas, 3435)
points <- st_transform(points, 3435)
```
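To confirm the transformation took effect, you can check the CRS directly. A quick, optional sanity check:
```{r}
# Both layers should now report EPSG 3435
st_crs(areas)$epsg
st_crs(points)$epsg
```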
We may want to once again confirm they are plotting correctly:
```{r}
tm_shape(areas) + tm_polygons() +
tm_shape(points) + tm_dots(size = 1)
```
## Count resources by area
One way of understanding resource inequity is by thinking about how many resources exist in a neighborhood.
First, give the points the attributes of the polygons they fall in, then inspect the result.
```{r}
# Spatial join: attach to each point the attributes of the zip code it falls in
pipr <- st_join(points, areas)
head(pipr)
```
You should have the same number of rows in `pipr` as in `points`; if not, something is off and you may need to go back and troubleshoot. In an earlier version of this lab, I used a saved, written `geojson` file of the zip codes, and it would not render properly. Here, we load the original shapefile at the beginning of the tutorial to avoid that error.
```{r}
dim(pipr)
dim(points)
dim(areas)
```
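If you'd rather have this check fail loudly than rely on eyeballing the output, a one-line assertion (an optional sketch) works:
```{r}
# Stop with an error if the join changed the number of point rows
stopifnot(nrow(pipr) == nrow(points))
```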
### Count # per Area
Next, count the number per area. The frequencies should be consistent with the map you made earlier. Sometimes I've found bugs where the numbers are multiplied by some factor; this can be checked by looking for dimension disparities, as noted above.
```{r}
# Tabulate the number of joined points per zip code
ptcount <- as.data.frame(table(pipr$Zip))
head(ptcount)
```
How could you improve on this step using `dplyr`?
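One possibility, sketched here assuming `dplyr` is loaded, is to count and rename in a single pipeline:
```{r}
library(dplyr)

# Drop geometry, count points per zip, and rename columns in one pipeline
ptcount_tidy <- pipr %>%
  st_drop_geometry() %>%
  count(Zip, name = "MetClnc") %>%
  rename(zip = Zip)

head(ptcount_tidy)
```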
**Aggregation Tip:** What if you have an attribute value you'd like to aggregate? For example, the average number of units per affordable housing facility by zip?
Try `aggregate(pipr$attribute, by = list(pipr$geoid), mean)` (with `attribute` and `geoid` as placeholders for your own columns), then build on it with a tidy sensibility, as sketched below.
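As a hedged sketch, assuming a hypothetical numeric attribute called `units` in the joined data (substitute column names that actually exist in your own data), the tidy version might look like:
```{r eval = FALSE}
library(dplyr)

# Hypothetical: average a numeric attribute (`units`) per zip code
# Replace `units` with a column that exists in your joined data
avgunits <- pipr %>%
  st_drop_geometry() %>%
  group_by(Zip) %>%
  summarize(avg_units = mean(units, na.rm = TRUE))
```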
Now we can rename our attributes:
```{r}
names(ptcount) <- c("zip", "MetClnc")
head(ptcount)
```
And finally, merge back to your master zip file:
```{r}
head(areas)

# Keep all zip codes (all = TRUE), even those with no clinics
areas <- merge(areas, ptcount, by = "zip", all = TRUE)
head(areas)
```
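One caveat: because we merged with `all = TRUE`, zip codes containing no clinics come through with `NA` rather than `0` in `MetClnc`. If you'd rather treat those as explicit zero counts (often cleaner for mapping), a quick recode does it:
```{r}
# Zip codes with no clinics get NA from the merge; recode them as zero counts
areas$MetClnc[is.na(areas$MetClnc)] <- 0
```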
Quickly map to inspect:
```{r}
tm_shape(areas) + tm_polygons(col = "gray80") +
tm_shape(areas) + tm_polygons(col = "MetClnc", style = "pretty", alpha = 0.8) +
tm_shape(points) + tm_dots(size = 0.5)
```
## Buffer Data
Next, let's create a walkable buffer of one mile, or 5,280 feet, around each MOUD provider location. Individuals residing outside of that walkable area may have difficulty accessing this medication during crises like a pandemic.
```{r}
# Create 1-mile (5280 ft) buffers around each point; EPSG:3435 units are feet
ptbuffers <- st_buffer(points, 5280)
head(ptbuffers)
```
Inspect immediately:
```{r}
tm_shape(areas) + tm_borders() +
tm_shape(ptbuffers) + tm_borders(col = "blue")
```
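Recall that one of our promised outputs is a new spatial layer of service areas. If you want to keep the buffers for later use, you can write them out now; the filename here is just a placeholder:
```{r eval = FALSE}
# Save the 1-mile service areas as a new spatial layer
# (the filename is a placeholder -- use whatever fits your project)
st_write(ptbuffers, "data/moudServiceAreas.geojson")
```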
## Count buffers by area
We assume that MOUD locations are accessible up to one mile away, so a total count of resources physically inside each area may be too restrictive: a clinic just across a zip code boundary can still be walkable. Let's instead calculate how many *walkable* MOUD clinics serve each zip code. In other words, how many buffers intersect each area...
```{r}
# For each zip code, count how many service-area buffers intersect it
bufferct <- lengths(st_intersects(areas, ptbuffers))
head(bufferct)
```
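Before merging, it's worth glancing at the distribution of counts; it should look plausible given your earlier maps:
```{r}
# Frequency and spread of walkable-clinic counts across zip codes
table(bufferct)
summary(bufferct)
```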
Stick buffer totals back to the zip master file:
```{r}
# Attach buffer counts back to the zip code master file
areas <- cbind(areas, bufferct)
head(areas)
```
Map the density of buffers per zip code:
```{r}
tm_shape(areas) + tm_polygons(col = "bufferct", palette = "BuGn", n=5, style = "jenks") +
tm_shape(ptbuffers) + tm_fill(col = "gray90", alpha=0.6) +
tm_shape(points) + tm_dots(col = "gray10", size = 0.3)
```
## Integrate & Explore
Let's review: our master area file now has the total number of resources by zip and the total number of walkable service areas by zip.
Using your new spatial file, see if you can answer some of these questions using various queries:
- Which zip codes have high rates of COVID and are not within walking distance of a methadone MOUD provider?
- Which zip codes have worse access to affordable rental units, lower educational attainment, and less walkable access to MOUDs?
- What are the demographic and racial/ethnic characteristics of the areas most vulnerable to high COVID rates in September 2020?
Generate different maps and outputs to drive your thinking and support your hypotheses. One starting point follows below.
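As one minimal sketch using only the variables created in this tutorial (the COVID and housing questions will need columns merged in from earlier tutorials, with names that vary by your own data):
```{r}
# Zip codes with no walkable MOUD service area at all
noaccess <- areas[areas$bufferct == 0, ]
head(noaccess)

# Quick visual check of where those zip codes are
tm_shape(areas) + tm_borders() +
  tm_shape(noaccess) + tm_polygons(col = "red", alpha = 0.5)
```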