-
Notifications
You must be signed in to change notification settings - Fork 5
/
trad01-predictors.qmd
125 lines (93 loc) · 6.04 KB
/
trad01-predictors.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
---
title: "Predictors"
execute:
cache: FALSE
---
## Why predictor variables?
Predictor variables (aka "covariate variables" and "environmental variables") are provided in the downloaded [observations](observations.html#examining-emnbedded-covariates). We can certainly use that data, but to use [MaxEnt](https://biodiversityinformatics.amnh.org/open_source/maxent/) presence only modeling, we need a way to characterize the background. For that we select random points from the same region and times that we have observations. Well, those data do not come with the OBIS download, we have to get them ourselves.
Another nice thing about using external predictor variables is that we can use them, once we have a model in hand, to make predictive maps.
## What predictor variables?
Here we chose to use monthly means of sea surface temperature [OISST](https://www.ncei.noaa.gov/products/optimum-interpolation-sst) and sea winds [NBS v2](https://coastwatch.noaa.gov/cwn/products/noaa-ncei-blended-seawinds-nbs-v2.html).
Each of these are available as easily access on-line monthly datasets. Monthly means seem like a good granularity for this dataset (and a tutorial). It is appealing, but unsubstantiated, that SST and winds would have an influence on *Mola mola*/observer interactions.
## Storing large data and github
Let's make two sub-directories in our `data` directory; one each for OISST and NBS2.
```{r}
source("setup.R")
oisst_path = "data/oisst"
nbs_path = 'data/nbs'
ok = dir.create(oisst_path, recursive = TRUE, showWarnings = FALSE)
ok = dir.create(nbs_path, recursive = TRUE, showWarnings = FALSE)
```
It is not unreasonable to keep your data in your project directory, but gridded data can be quite large. Sometimes it gets so large that the size limit on github repositories can be exceeded. The simplest way to prevent that is to prevent git from tracking your large data directories by adding filters to your local `.gitignore` file. Here's what we put in our project's `.gitignore` file. These lines tell `git` to not catalog items found in those two directories.
```
# Biggish data directories
data/oisst/*
data/nbs/*
```
## Fetching data
### OISST sea surface temperature data
We'll use the [oisster](https://github.com/BigelowLab/oisster) R package to fetch OISST monthly data. Now, OISST is mapped into the longitude range of [0,360], but our own data is mapped into the longitude range [-180,180]. So we need to request a box that fits into OISST's range. The [oisster](https://github.com/BigelowLab/oisster) R package) provides a function to help translate the bounding box. This can a little while.
At the end we build a small database of the files we have saved, which will be very helpful to us later when we need to find and open the files. the `oisster` package provides a number of organizational utilities, such as saving data into a directory hierarchy that works well when storing a mix of data (monthlies, dailies, etc). We don't really leverage these utilities a lot, but what is provided is helpful to us.
First, let's getthe bounding box and transfprm it to a 0-360 longitude range.
```{r}
bb = get_bb(form = "numeric")
bb = oisster::to_360_bb(bb)
```
And now get the rasters. This can take a long time, so we test for the existence of the local NBS database first. If the database doesn't yet exist, then we download the data.
```{r oisst, warning = FALSE, message = FALSE}
oisst_db_file = file.path(oisst_path, "database.csv.gz")
if (file.exists(oisst_db_file)){
sst_db = oisster::read_database(oisst_path)
} else {
sst = oisster::fetch_month(dates = seq(from = as.Date("2000-01-01"),
to = current_month(offset = -1),
by = "month"),
bb = bb,
path = oisst_path)
sst_db = oisster::build_database(oisst_path, save_db = TRUE)
}
sst_db
```
The `ltm` is reserved for use with the long-term-mean products which we do not download. We can select from the database a small number of file to load and view.
```{r oisst-load}
# first we filter
sub_sst_db = sst_db |>
dplyr::filter(dplyr::between(date, as.Date("2000-01-01"), as.Date("2000-12-01")))
# then read into a multi=layer stars object
sst = stars::read_stars(oisster::compose_filename(sub_sst_db, oisst_path),
along = list(date = sub_sst_db$date))
plot(sst)
```
### NBS wind data
Formerly known as "Blended Sea Winds", the NBS v2 data set provides global wind estimates of wind (u and v components as well as windspeed) gong back to 1987. We shall collect monthly raster for our study area from 2000 to close to present (nbs lags by about 2 months). Like OISST, NBS is served on a longitude range of 0-360.
:::{.callout-note}
The NBS dataset for monthly UV has some issues - certain months are not present even though the server says they are. This may generate error messages which we can ignore, but the data for that month will not be downloaded.
:::
This can take a long time, so we test for the existence of the local NBS database first. If the database doesn't yet exist, then we download the data.
```{r nbs, warning = FALSE, message = FALSE}
wind_db_file = file.path(nbs_path, "database.csv.gz")
if (file.exists(wind_db_file)){
wind_db = nbs::read_database(nbs_path)
} else {
wind_db = query_nbs(period = "monthly",
dates = c(as.Date("2000-01-01"), Sys.Date()-75),
product = "uvcomp") |>
fetch_nbs(param = 'all',
bb = bb,
path = nbs_path,
verbose = FALSE) |>
write_database(nbs_path)
}
```
Just as with sst, we can select from the database a small number of file to load and view.
```{r nbs-load}
# first we filter
sub_wind_db = wind_db |>
dplyr::filter(param == "windspeed",
dplyr::between(date, as.Date("2000-01-01"), as.Date("2000-12-01")))
# then read into a multi=layer stars object
wind = stars::read_stars(nbs::compose_filename(sub_wind_db, nbs_path),
along = list(date = sub_wind_db$date)) |>
rlang::set_names("windspeed")
plot(wind)
```