-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.html
276 lines (275 loc) · 40.7 KB
/
README.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
<h1>Preferential Sampling with moving monitoring stations</h1>
<p>This repository is a companion resource to the paper “Modelling ocean temperatures from bio-probes under preferential sampling” (submitted), by Daniel Dinsdale and Matias Salibian-Barrera and it contains code to illustrate how to apply the methods discussed in that paper.</p>
<h2>Introduction</h2>
<p>Below we illustrate how to model data obtained from sensor tags mounted on marine mammals which may have been preferentially sampled. The <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">R</code> code used below in this document can also be found in a single file here: <a href="runfile.R">runfile.R</a>. The required functions are in the file <a href="dataFncs.R">dataFncs.R</a>, and the negative log-likelihood function required by <a href="https://cran.r-project.org/package=TMB">TMB</a> is in the file <a href="TMBfile.cpp">TMBfile.cpp</a>.</p>
<p>The example below follows the simulation settings discussed in Section 4 of the paper and uses the Preferential-CRW model to generate preferentially sampled animal tracks and their corresponding temperature observations. We then compare parameter estimation and prediction using standard methods and the preferential model in TMB.</p>
<h5>Warning</h5>
<p>Note that running this code may require a large amount of RAM (we recommend 16GB).</p>
<h2>Generating a data set</h2>
<p>First we source the file <a href="dataFncs.R">dataFncs.R</a> which contains the necessary functions to generate the data.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">source('dataFncs.R')</span></span></pre>
<p>Now we specify the parameters to generate the data set. These can be altered to vary the properties of the latent field to be sampled and also to change the movement patterns of the sampler. Refer to the paper for more details on the model.</p>
<p>First we set the parameters of the assumed Matern covariance function and the (constant) mean of the underlying random field:</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># constant mean</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">mean <- 5</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># scale (range)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">phi <- 25<span class="trailing-whitespace"> </span></span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># nugget (measurement) variance</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">nugget <- 0.1</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># smoothness (assumed known in estimation)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">kappa <- 2</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># marginal variance (partial sill)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">GPVar <- 1.5</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># define the covariance model</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">model <- RMwhittle(nu=kappa, var=GPVar, scale=phi)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># finally trend = 0 just means our mean trend is constant (mean from above)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">trend <- 0</span></span></pre>
<p>Next we specify the parameters that determine the movement/sampler properties. Please refer to the paper for more details on the model and its parameters.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># is starting location random? (0 = yes and >0 multivariate normal with</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar"># mean 0 and diagonal covariance matrix with variance=start)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">start <- 0</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># alpha[1] defines starting value of beta_1 from eq (3.6)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># alpha[2:3] are currently both equal to \alpha from eq (3.7). They could be changed to</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># adopt preferential movement varying in strength across latitude/longitude.</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">alpha <- c(-1.5, 100, 100)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># the number of tracks in the simulation</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">numTracks <- 3</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># the number of data points to simulate per track</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">n <- 360</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># the number of observations to throw out per track (ie/ total sample</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># size per track is n-burnIn)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">burnIn <- 60</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># the rate parameter of the exponential distribution used to generate the sampling times</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">timing <- 10</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># measurement location noise (currently not included in models)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">noise <- 0</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># movement parameters</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># behaviour (beta) standard deviation parameter (\sigma_{\beta} in eq (3.6))</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">behavSD <- .1<span class="trailing-whitespace"> </span></span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># movement standard deviation parameter (diag(\Sigma) in eq (3.3))</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">moveSD <- 3<span class="trailing-whitespace"> </span></span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># combine standard deviations for later use</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">dataParam <- c(behavSD, moveSD)</span></span></pre>
<p>We now create a spatial grid (lattice) for data simulation and also for model fitting/predictions. These can be the same if <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">nrowcol = l</code> (the latter is a lowercase letter <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">L</code>) but we can choose different grid sizes for computational efficiency.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># define the domain to simulate data</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar"># how man rows/columns</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">nrowcol <- 51</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">x <- seq(-150, 150, length.out=nrowcol)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">y <- seq(-150, 150, length.out=nrowcol)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># simulation grid</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">gridFull <- expand.grid(x,y)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"></span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># l is the number of grid cells in each direction across our</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># grid going from -150 to 150 degrees lat and lon for our model fitting</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># and prediction.</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">l <- 26</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">xseq <- (seq(-150, 150, length.out=l))</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">yseq <- (seq(-150, 150, length.out=l))</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># create the prediction lattice</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">lattice <- expand.grid(xseq,yseq)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">colnames(lattice) <- c("Y1New", "Y2New")</span></span></pre>
<!-- ```{r initiate, include=FALSE} -->
<!-- # initiate objects -->
<!-- # nonPrefParams <- NULL -->
<!-- # prefParams <- NULL -->
<!-- # postBias <- NULL -->
<!-- # krigBias <- NULL -->
<!-- # nonPrefBias <- NULL -->
<!-- # postIGN <- NULL -->
<!-- # krigIGN <- NULL -->
<!-- # nonPrefIGN <- NULL -->
<!-- # nonPrefParams <- array(NA, dim=c(1, 4)) -->
<!-- # prefParams <- array(NA, dim=c(1, 8)) -->
<!-- ``` -->
<p>Now we are ready to generate the data. We first simulate the latent field and then run the sampler using the function <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">genPrefDataHybridBehav</code> which can be found in <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">dataFncs.R</code>. From here we will extract the data and the so-called true surface on the lattice and observed locations.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># simulate the random field</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">set.seed(6351)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">rawDat <- RFsimulate(model, x=as.matrix(gridFull), exactness=TRUE)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># simulate the observations and sampling locations</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">Xsim <- genPrefDataHybridBehav(n=n, movementParam=dataParam, nrowcol=nrowcol, m=0,<span class="trailing-whitespace"> </span></span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>paramGP=c(mean, phi, nugget, kappa, GPVar), numTracks=numTracks,<span class="trailing-whitespace"> </span></span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>alpha=alpha, rawDat=rawDat, start=start, burnIn = burnIn, timing=timing)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># extract sampling data</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">data <- Xsim$Dat</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># extract true surface data</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">surface <- Xsim$surface</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">colnames(data) <- c("Time", "Lon", "Lat", "Temp", "gradientX", "gradientY", "Beta", "Track")</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># here is how the data (locations and respective latent field measurements)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">head(data)</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## Time Lon Lat Temp gradientX gradientY Beta</span></span></span>
<span class=""><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 1 0.00000000 90.62477 83.41096 4.222824 0.01650040 -0.02259048 -1.217568</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 2 0.03895239 90.22061 82.80155 4.167551 0.01499146 -0.02558380 -1.255394</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 3 0.08395329 89.59377 82.23768 4.119081 0.01388511 -0.02918536 -1.267580</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 4 0.11661354 88.57083 82.05452 3.890522 0.01444768 -0.03325392 -1.257057</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 5 0.29282906 84.62312 80.91155 3.690016 0.01259540 -0.04931482 -1.254136</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 6 0.32905096 85.08989 80.72604 3.703261 0.01180773 -0.04797840 -1.273623</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## Track</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 1 1</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 2 1</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 3 1</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 4 1</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 5 1</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## 6 1</span></span></span></pre>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># now we thin the data to 300 locations in total for analysis<span class="trailing-whitespace"> </span></span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">selection <- seq(1, nrow(data), length.out = 300)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">dataThin <- data[selection, ]</span></span></pre>
<p>Here is how the data looks. Each colour is a different track and dots are sampling locations which are superimposed onto the unknown latent field. Note, that this is the same data from Fig 2.</p>
<p><img src="README_files/figure-markdown_github/plotdata-1.png" alt=""></p>
<p>Now we compile the file <a href="TMBfile.cpp">TMBfile.cpp</a> to use in TMB. This file contains the negative joint log-likelihood function -log([<em>X</em>, <em>Y</em>, <em>S</em>]). Note that you must have installed the <a href="https://cran.r-project.org/package=TMB">TMB</a> <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">R</code> package from CRAN.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">compile("TMBfile.cpp")</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">dyn.load(dynlib("TMBfile"))</span></span></pre>
<p>Next is some house keeping to prepare the data for TMB (refer to the comments inside the script below for details):</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># replace data with thinned version</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">data=dataThin</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># obtain sampling times</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">tsim <- data[,1]</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># number of observations in total</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">numObs <- nrow(data)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># Generate random measurements</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># create trackID which records when new tracks start in the dataframe</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">trackLength <- NULL</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">trackId <- 0</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">for(i in 1:numTracks){</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>trackLength <- c(trackLength, length(which(data$Track==i)))</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>trackId <- c(trackId, sum(trackLength))</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">}</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># create a set of locations which allows for gradients to be calculated in cpp file</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">Yobs <- data$Temp</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">Y1New <- data[,2]</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">Y2New <- data[,3]</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">for(i in 1:length(data[,1])){</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>Y1New <- c(Y1New,data[i,2]+.5, data[i,2])</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>Y2New <- c(Y2New,data[i,3], data[i,3]+.5)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">}</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># combine prediction lattice with sampling locations and gradient locations</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">predGrid <- rbind(cbind(Y1New, Y2New), lattice)</span></span></pre>
<p>Next we create a mesh using <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">inla.mesh.create</code> to use the SPDE approach of <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">R-INLA</code>. We mush be careful to specify an index that matches sampling locations with mesh locations, but also change indexing for use in the <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">C++</code> code.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># create INLA mesh</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">mesh <- inla.mesh.create(loc = predGrid, extend = T, refine = T)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># now create an index that matches sampling locations with mesh locations</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">ii0 <- mesh$idx$loc</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># Create data for TMB</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">dataTMB <- list(tsim=tsim,Y1=Y1New, Y2=Y2New, Y=Yobs, trackId=trackId, meshidxloc=mesh$idx$loc-1)</span></span></pre>
<p>Now we will create our sparse precision matrix for smoothness (<em>?</em>) 2, which enables the field to be differentiable (in mean square sense). For details on this part see Appendix A.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># using SPDE method from R-INLA with alpha=2 (kappa=1)</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">dataTMB$spde <- (inla.spde2.matern(mesh, alpha=2)$param.inla)[c("M0","M1","M2")]</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># create our own sparse precision matrix for alpha=3 (kappa=2)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">M0 <- (inla.spde2.matern(mesh, alpha=2)$param.inla)["M0"]$M0</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">M1 <- (inla.spde2.matern(mesh, alpha=2)$param.inla)["M1"]$M1</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">M2 <- (inla.spde2.matern(mesh, alpha=2)$param.inla)["M2"]$M2</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">M3 <- M2%*%solve(M0)%*%M1</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">M3 <- as(M3, "dgTMatrix")</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">dataTMB$spde[["M3"]]<- M3</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># number of rows in SPDE object</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">n_s = nrow(dataTMB$spde$M0)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># vector of 1's used in TMB (this should be updated)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">dataTMB$Ind <- rep(1, n_s)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># create geodata object</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">obj1 <- cbind(cbind(dataTMB$Y1, dataTMB$Y2)[1:length(dataTMB$Y),], dataTMB$Y)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">geodata <- as.geodata(obj1, coords.col = 1:2, data.col = 3)</span></span></pre>
<h2>Parameter Estimation</h2>
<p>Time to fit some models! First let us fit a standard model using <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">likfit</code> from the pacakge <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">geoR</code>. This approach ignores any preferential effect and works conditional on the observed sampling locations <strong>X</strong>.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">standardMLE <- likfit(geodata, coords = geodata$coords, data = geodata$data, kappa=kappa, ini=c(.5,.5))</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## ---------------------------------------------------------------</span></span></span>
<span class=""><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## likfit: likelihood maximisation using the function optim.</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## likfit: Use control() to pass additional</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## arguments for the maximisation function.</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## For further details see documentation for optim.</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## likfit: It is highly advisable to run this function several</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## times with different initial values for the parameters.</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## likfit: <span class="syntax--storage syntax--type syntax--class syntax--warning">WARNING</span>: This step can be time demanding!</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## ---------------------------------------------------------------</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## likfit: end of numerical maximisation.</span></span></span></pre>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">(standardMLE)</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## likfit: estimated model parameters:</span></span></span>
<span class=""><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## beta tausq sigmasq phi<span class="trailing-whitespace"> </span></span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## " 4.3058" " 0.1093" " 0.7261" "18.5567"<span class="trailing-whitespace"> </span></span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## Practical Range with cor=0.05 for asymptotic range: 99.6194</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">##<span class="trailing-whitespace"> </span></span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## likfit: maximised log-likelihood = -170.3</span></span></span></pre>
<p>Next we will fit the model in <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">TMB</code>. First we define the parameters for the model (including latent states). Our latent states are the field <strong>S</strong> and behavioural states <strong>beta</strong>. The call to <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">MakeADFun</code> creates the likelihood function, which is then optimized numerically using <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">nlminb</code> (but other general-purpose optimization functions, e.g. <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">optim</code>, can also be considered).</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">parameters <- list(</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>S = rep(0, n_s),</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>beta = rep(0, length(dataTMB$Y)),</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>mu = standardMLE$beta,</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>log_papertau = 3,</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>log_kappa = log(1/standardMLE$phi),</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>alpha = rnorm(1,alpha[2], 0.25),</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>log_d = log(dataParam[2]),</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>log_sdbehav = log(dataParam[1])</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># create TMB object (note= random=c("S", "beta") to</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># integrate out random field and latent behvaiour states)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">obj <- MakeADFun(dataTMB, parameters, random=c("S", "beta"), DLL="TMBfile", method = "nlminb", hessian=FALSE, silent=T)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># conduct maximisation</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">opt <- try( nlminb(obj$par,obj$fn,obj$gr, control=list(rel.tol=1e-7)) )</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># rerun up to 4 times in case of any gradient errors</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">for(m in 1:4){</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>if(class(opt) != 'try-error' && opt$convergence == 0) {</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>print("Success!")</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>}</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>else{<span class="trailing-whitespace"> </span></span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>paste0("Failed, try number ", m)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>lengthPar <- length(obj$env$last.par.best)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>tmp <- obj$env$last.par.best[(lengthPar-5):lengthPar] + 0.01</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>opt <- try(nlminb(tmp,obj$fn,obj$gr, control=list(rel.tol=1e-7)))</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>}</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">}</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## [1] "Success!"</span></span></span>
<span class=""><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## [1] "Success!"</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## [1] "Success!"</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## [1] "Success!"</span></span></span></pre>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># Extract sigma^2 (partial sill)</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">report_spde <- obj$report()</span></span></pre>
<p>It is always good practice to verify that the optimization iterations have converged:</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># check convergence</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">opt$convergence</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## [1] 0</span></span></span></pre>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># Obtain the standard errors</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">sdre <- try( sdreport(obj) )</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">if( class(sdre) != 'try-error') {</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span># input params</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>summary(sdre, "fixed")</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">}</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## Estimate Std. Error</span></span></span>
<span class=""><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## mu 4.406319 0.24327250</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## log_papertau 4.215612 0.21093490</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## log_kappa -2.830299 0.15373855</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## alpha 142.870640 25.55025060</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## log_d 2.209733 0.03291732</span></span></span>
<span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## log_sdbehav -2.733629 1.25336438</span></span></span></pre>
<h2>Prediction</h2>
<p>We have obtained parameter estimates for the standard method and for the preferential model using <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">TMB</code>. To predict using the non-preferential model we will use kriging with plug-in parameters obtained from the standard <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">likfit</code> function. For the preferential model we use the mode of the <strong>[S|Y,X]</strong> at the optimal parameters. This is provided by <code style="font-family: Menlo, Consolas, "DejaVu Sans Mono", monospace;">TMB</code> as part of the Laplace approximation procedure and is defined in eq (2.7) of the paper.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># conduct simple kriging using standard MLE plug-in parameters</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">SKDat <- krige.control(obj.model = standardMLE, type.krige = "SK")</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># now predict at the prediction "lattice" locations where signal=T is used</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># to specify that there was measurement error on our data observations</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">nonPredPref <- krige.conv(geodata, loc = lattice, krige = SKDat, output=list(signal=T))</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## krige.conv: model with constant mean</span></span></span>
<span class=""><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## krige.conv: Kriging performed using global neighbourhood</span></span></span></pre>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># finally we obtain preferential model field prediction from mode of [S|Y,X]</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">modePred <- obj$env$last.par.best[(length(Y1New)+1):(length(Y1New)+nrow(lattice))]</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># non-pref predictions</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">nonPrefPred <- nonPredPref$predict</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># non-pref prediction variance</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">nonPrefVar <- nonPredPref$krige.var</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar"># prediction variance from TMB</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">predVar <- (summary(sdre, "random")[(length(Y1New)+1):(length(Y1New)+nrow(lattice)),2])^2</span></span></pre>
<p>Next we want to be able to compare these predictions to the real values of the field at the prediction points.</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar"># obtain real data on prediction lattice</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar"># match indicies of full grid used to simulate data and prediction lattice</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">matchedIndic <- row.match(lattice,gridFull)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">rawDatSmall <- rawDat$variable1[matchedIndic] + mean</span></span></pre>
<p>Now let us calculate the mean ignorance score for each method on this data set (MIGN from eq (4.2) in the paper). Recall that the ignorance function (IGN) is given by</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">IGN <- function(pred, act, var) {</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar"><span class="leading-whitespace"> </span>((pred - act)^2) / var + log(var)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">}</span></span></pre>
<p>Then the MIGN can be computed as follows:</p>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">IgnScorePost <- IGN(modePred, rawDatSmall, predVar)</span></span>
<span class=""><span class="syntax--text syntax--plain syntax--null-grammar">IgnScoreNonPref <- IGN(nonPredPref$predict, rawDatSmall, nonPredPref$krige.var)</span></span>
<span><span class="syntax--text syntax--plain syntax--null-grammar">mean(IgnScorePost)</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## [1] -0.5294663</span></span></span></pre>
<pre class="editor-colors lang-r"><span><span class="syntax--text syntax--plain syntax--null-grammar">mean(IgnScoreNonPref)</span></span></pre>
<pre class="editor-colors lang-text"><span><span class="syntax--text syntax--plain"><span class="syntax--meta syntax--paragraph syntax--text">## [1] -0.3737076</span></span></span></pre>
<p>Finally we can plot the IGN scores and compare predictive surfaces from the non-preferential and preferential models. We consider only prediction locations in regions near the sampling locations: <img src="README_files/figure-markdown_github/showign-1.png" alt=""><img src="README_files/figure-markdown_github/showign-2.png" alt=""><img src="README_files/figure-markdown_github/showign-3.png" alt=""></p>
<p>Note that the mean IGN for the following two plots are -0.96 (TMB) and -0.88 (kriging) respectively. Comparing this to Fig 5 (b), this simulation shows a relatively small improvement by the preferential model compared to most simulations with these parameters, mainly due to the large coverage of the field by the data locations. <img src="README_files/figure-markdown_github/plotign2-1.png" alt=""><img src="README_files/figure-markdown_github/plotign2-2.png" alt=""></p>