-
Notifications
You must be signed in to change notification settings - Fork 3
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
2 changed files
with
234 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,218 @@ | ||
--- | ||
title: "Example 1: Simple" | ||
author: "David Hodgson" | ||
date: "`r Sys.Date()`" | ||
output: rmarkdown::html_vignette | ||
vignette: > | ||
%\VignetteIndexEntry{Example 1: Simple} | ||
%\VignetteEngine{knitr::rmarkdown} | ||
%\VignetteEncoding{UTF-8} | ||
--- | ||
|
||
```{r include=FALSE} | ||
knitr::opts_chunk$set(echo = TRUE) | ||
``` | ||
|
||
```{r setup} | ||
#devtools::install("..") #install if needed | ||
library(ptmc) | ||
library(tidyverse) | ||
library(coda) | ||
``` | ||
|
||
# Parallel Tempering Monte Carlo | ||
|
||
## Model | ||
|
||
The example model is taken from Ben Lambert's fantastic book, "A Student's Guide to Bayesian Statistics," [Questions 13.3](https://benlambertdotcom.files.wordpress.com/2018/08/bayesianbook_problemsanswers_final.pdf#page=124). Briefly, I assume the random count variable $X_t$, the number of mosquitos caught during day $t$, is Poisson, such that | ||
|
||
$$ | ||
\begin{split} | ||
X_t &\sim \textit{Poisson}(\lambda)\\ | ||
\lambda &= 1000 \times \exp(-\mu t)\psi | ||
\end{split} | ||
$$ | ||
|
||
where $\mu$ is a constant mortality hazard rate and $\psi$ is the daily recapture probability with prior distributions of $\mu \sim \Gamma(2,20)$ and $\psi \sim \textit{Beta}(2, 40)$. The data $x_t$, the number of mosquitos caught on day $t$, are given in the RWM_mosquito.csv file but can also be found [here](https://benlambertdotcom.files.wordpress.com/2018/08/all_data.zip). | ||
|
||
|
||
## Implementation | ||
|
||
### Create the model | ||
The `model` list should be a list with three functions, `samplePriorDistributions`, `evaluateLogLikelihood`, and `evaluateLogPrior` and a vector string with the parameter names used in the calibration, `namesOfParameters`. | ||
|
||
#### vector string `par_names` | ||
A vector string of the parameter names. | ||
|
||
#### func `samplePriorDistributions` | ||
A function which generates the initial values (usually by sampling the prior distributions) | ||
|
||
* (no arguments) | ||
* @return vector of values | ||
|
||
#### func `evaluateLogLikelihood` | ||
A function which generates the log-likelihood for a set of parameter values | ||
|
||
* @param data, data needed to calculate the log-likelihood | ||
* @param params, parameter values | ||
* @return log-likelihood | ||
|
||
#### func `evaluateLogPrior` | ||
A function which calculates the log-prior for a set of parameter values | ||
|
||
* @param params, parameter values | ||
* @return log-prior | ||
|
||
|
||
```{r model definition} | ||
# model is a list of three functions and a vector string | ||
model <- list( | ||
lowerParSupport_fitted = c(0, 0), | ||
upperParSupport_fitted = c(10, 1), | ||
namesOfParameters = c("mu","psi"), | ||
samplePriorDistributions = function(datalist) { | ||
c(rgamma(1, 2, 20), rbeta(1, 2, 40)) | ||
}, | ||
evaluateLogPrior = function(params, datalist) { | ||
if (params[1] < 0 || params[2] < 0 || params[1] > 1 || params[2] > 1) | ||
return(-Inf) | ||
lpr <- 0 | ||
lpr <- lpr + log(pgamma( params[1], 2,20)) | ||
lpr <- lpr + log(pbeta( params[2], 2,40)) | ||
lpr | ||
}, | ||
evaluateLogLikelihood = function(params, covariance, datalist) { | ||
y <- datalist$y | ||
mu <- params[1] | ||
psi <- params[2] | ||
ll <- 0; | ||
for (t in 1:length(datalist$y)) | ||
{ | ||
sum_x = 0 | ||
lambda = 1000*exp(-mu*(t+1))*psi; | ||
for (i in 1:y[t]){ | ||
sum_x <- sum_x + i | ||
} | ||
ll <- ll - lambda + y[t]*log(lambda) - sum_x | ||
} | ||
ll | ||
} | ||
) | ||
``` | ||
|
||
|
||
### Obtain the data | ||
Data used in the log-likelihood function can be in any format. | ||
|
||
```{r data import} | ||
dataMos <- read.csv("RWM_mosquito.csv") | ||
# restructure the data for the log-likelihood function | ||
data_t <- list( | ||
time = dataMos$time, | ||
y = dataMos$recaptured | ||
) | ||
``` | ||
|
||
### Settings | ||
The settings for the parallel tempering algorithm are summarised here. | ||
|
||
#### Settings Options | ||
|
||
* numberChainRuns, number of independent chains to run | ||
* numberTempChains, number of dependent chains per chain run (i.e. the number of rungs in the temperature ladder) | ||
* iterations, the number of steps to take in the Markov chain, (including the burn-in) | ||
* burninPosterior, the number of steps in the burn-in (these are discarded) | ||
* thin, thinning of the chain (i.e. =10 means only every 10th sample is saved) | ||
* consoleUpdates, frequency at which the console updates (i. =100 means every 100th step) | ||
* numberFittedPar, number of parameters | ||
* onAdaptiveCov, whether to include adaptive covariance | ||
* updatesAdaptiveCov, frequency at which the adaptive covariance matrix is updated | ||
* burninAdaptiveCov, number of steps to take before using the adaptive covariance matrix | ||
* onAdaptiveTemp, whether to include adaptive temperature ladder | ||
* updatesAdaptiveTemp, frequency at which the adaptive temperature ladder is updated | ||
* Debug, run with debug output | ||
|
||
```{r settings} | ||
# settings used for the ptmc model | ||
settings <- list( | ||
numberChainRuns = 2, | ||
numberTempChains = 10, | ||
iterations = 10000, | ||
burninPosterior = 5000, | ||
thin = 10, | ||
consoleUpdates = 100, | ||
numberFittedPar = 2, | ||
onAdaptiveCov = TRUE, | ||
updatesAdaptiveCov = 200, | ||
burninAdaptiveCov = 100, | ||
onAdaptiveTemp = TRUE, | ||
updatesAdaptiveTemp = 10, | ||
onDebug = FALSE, | ||
lowerParBounds = model$lowerParSupport_fitted, | ||
upperParBounds = model$upperParSupport_fitted, | ||
covarInitVal = 1e-8, # make very small if struggling to sample to beginning | ||
covarInitValAdapt = 1e-4, # make very small if struggling to sample to beginning | ||
covarMaxVal = 1, # decrease if struggling to sample in the middle | ||
runParallel = TRUE, | ||
numberCores = 2 | ||
) | ||
``` | ||
|
||
### Run the model. | ||
|
||
```{r run model, message=FALSE, results = 'hide'} | ||
post <- ptmc_func(model=model, data=data_t, settings=settings) | ||
``` | ||
|
||
## Plot the data | ||
`ptmc_func` returns a list of length two. The first entry is `post$mcmc` a mcmc or mcmc.list object (from the coda package). I can plot these and calculate convergence diagnostics using coda functions: | ||
|
||
```{r plot outcomes} | ||
library(posterior) | ||
library(coda) | ||
library(bayesplot) | ||
summary(post$mcmc) | ||
post$mcmc %>% mcmc_trace | ||
# Plot the Gelman-Rubin diagnostic for the parameters | ||
gelman.plot(post$mcmc) | ||
gelman.diag(post$mcmc) | ||
``` | ||
|
||
The second entry is `post$lpost` and is long table dataframe of the log-posterior values. These values can be easily plotted using ggplot2: | ||
```{r} | ||
# Plot of the logposterior for the three chains | ||
lpost_conv <- post$lpost %>% filter(sample_no>250) | ||
logpostplot <- ggplot(lpost_conv, aes(x = sample_no, y = lpost)) + | ||
geom_line(aes(color = chain_no), size = 0.2, alpha=0.8) + | ||
theme_minimal() | ||
logpostplot | ||
``` | ||
|
||
The third entry is `post$temp` and is long table dataframe of the adaptive temperature values. These values can be easily plotted using ggplot2: | ||
```{r} | ||
# Plot of the logposterior for the three chains | ||
#temp_conv <- post$temp %>% filter(sample_no>5000) | ||
tempplot <- ggplot(post$temp, aes(x = sample_no, y = temperature)) + | ||
geom_line(aes(color = chain_no), size = 0.2, alpha=0.8) + | ||
theme_minimal() | ||
tempplot | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
time,recaptured | ||
1,39 | ||
2,21 | ||
3,36 | ||
4,34 | ||
5,18 | ||
6,22 | ||
7,43 | ||
8,11 | ||
9,17 | ||
10,9 | ||
11,15 | ||
12,11 | ||
13,13 | ||
14,13 | ||
15,7 |