From 5c76e4e7c486be89d6421032172a389c52184099 Mon Sep 17 00:00:00 2001 From: Amanda Minter Date: Thu, 21 Dec 2023 14:05:03 +0000 Subject: [PATCH] update late tasks from feedback (#85) * Extend challenge in 'Choosing an appropriate model' * Edit to accounting for uncertainty example Store all results and then extract infectious compartment for plotting * Add detail to Ebola case study challenge * update glossary entries * Add callout on ODE solver * clarify purpose of concept dependencies * Updates from review * Remove infection object * Update simulating-transmission.Rmd * Update model-choices.Rmd * Add exercise to `compare_interventions.Rmd` * update modelling interventions after review * Update renv.lock * update plots * make uncertainty plots consistent across episodes * fix broken link * making plots consistent across tutorials * update summary and key points * update plots and add challenge * update introduction * move contact matrix callout and clarify reduction is performed within the model functions * add callout on intervention types * update introduction and reorder text in other sections * update PI section * style code * spell check * Update compare-interventions.Rmd * Apply suggestions from code review Co-authored-by: Andree Valle Campos * Update episodes/modelling-interventions.Rmd Co-authored-by: Andree Valle Campos * add dropdown menus to make section more interactive * add reminder about transmissibility calculation * link to latent period * update model terms * add note on flow diagram * add callout on model rates * Update episodes/simulating-transmission.Rmd Co-authored-by: Andree Valle Campos * Update episodes/simulating-transmission.Rmd Co-authored-by: Andree Valle Campos * Update episodes/simulating-transmission.Rmd Co-authored-by: Andree Valle Campos * Update episodes/simulating-transmission.Rmd Co-authored-by: Andree Valle Campos * Update episodes/simulating-transmission.Rmd Co-authored-by: Andree Valle Campos * update tutorial objectives * 
distinguish parameter definitions from process descriptions in flow diagram * Update model-choices.Rmd * remove pak call from set up * Update renv.lock --------- Co-authored-by: Andree Valle Campos --- episodes/compare-interventions.Rmd | 149 +++++++- episodes/model-choices.Rmd | 266 ++++++++++++-- episodes/modelling-interventions.Rmd | 282 +++++++-------- episodes/simulating-transmission.Rmd | 369 ++++++++++---------- learners/reference.md | 33 +- renv/profiles/lesson-requirements/renv.lock | 27 +- 6 files changed, 728 insertions(+), 398 deletions(-) diff --git a/episodes/compare-interventions.Rmd b/episodes/compare-interventions.Rmd index a3c245e9..c8b344a2 100644 --- a/episodes/compare-interventions.Rmd +++ b/episodes/compare-interventions.Rmd @@ -19,7 +19,7 @@ library(epidemics) ::::::::::::::::::::::::::::::::::::: objectives -- Understand how to compare intervention scenarios +- Compare intervention scenarios :::::::::::::::::::::::::::::::::::::::::::::::: @@ -28,7 +28,7 @@ library(epidemics) ## Prerequisites + Complete tutorials [Simulating transmission](../episodes/simulating-transmission.md) and [Modelling interventions](../episodes/modelling-interventions.md) -This tutorial has the following concept dependencies: +Learners should familiarise themselves with the following concept dependencies before working through this tutorial: **Outbreak response** : [Intervention types](https://www.cdc.gov/nonpharmaceutical-interventions/). ::::::::::::::::::::::::::::::::: @@ -53,7 +53,7 @@ In this tutorial we introduce the concept of the counter factual and how to comp ## Vacamole model -The Vacamole model is a deterministic model based on a system of ODEs in [Ainslie et al. 2022]( https://doi.org/10.2807/1560-7917.ES.2022.27.44.2101090). The model consists of 11 compartments, individuals are classed as one of the following: +The Vacamole model is a deterministic model based on a system of ODEs in [Ainslie et al. 
2022]( https://doi.org/10.2807/1560-7917.ES.2022.27.44.2101090) to describe the effect of vaccination on COVID-19 dynamics. The model consists of 11 compartments, individuals are classed as one of the following: + susceptible, $S$, + partial vaccination ($V_1$), fully vaccination ($V_2$), @@ -119,42 +119,157 @@ DiagrammeR::grViz("digraph{ }") ``` -See `?epidemics::model_vacamole_cpp` for detail on how to run the model. -## Comparing scenarios +::::::::::::::::::::::::::::::::::::: challenge -*Coming soon* +## Running a counterfactual scenario using the Vacamole model -## Challenge +1. Run the model with the default parameter values for the UK population assuming that : -*Coming soon* ++ 1 in a million individuals are infectious (and not vaccinated) at the start of the simulation ++ The contact matrix for the United Kingdom has age bins: + + age between 0 and 20 years, + + age between 20 and 40, + + 40 years and over. ++ There is no vaccination scheme in place + +2. Using the output, plot the number of deaths through time + + +::::::::::::::::: hint + +### Vaccination code + +To run the model with no vaccination in place we can *either* create two vaccination objects (one for each dose) using `vaccination()` with the time start, time end and vaccination rate all set to 0, or we can use the `no_vaccination()` function to create a vaccination object for two doses with all values set to 0. 
+ +```{r, eval = FALSE} +no_vaccination <- no_vaccination(population = uk_population, doses = 2) +``` +:::::::::::::::::::::: - +::::::::::::::::: hint - +### HINT : Running the model with default parameter values +We can run the Vacamole model with [default parameter values](https://epiverse-trace.github.io/epidemics/articles/vacamole.html#model-epidemic-using-vacamole) by just specifying the population object and number of time steps to run the model for: - +```{r, eval = FALSE} +output <- model_vacamole_cpp( + population = uk_population, + vaccination = no_vaccination, + time_end = 300 +) +``` + +:::::::::::::::::::::: + + + +::::::::::::::::: solution + +### SOLUTION + +1. Run the model + +```{r} +polymod <- socialmixr::polymod +contact_data <- socialmixr::contact_matrix( + survey = polymod, + countries = "United Kingdom", + age.limits = c(0, 20, 40), + symmetric = TRUE +) +# prepare contact matrix +contact_matrix <- t(contact_data$matrix) + +# extract demography vector +demography_vector <- contact_data$demography$population +names(demography_vector) <- rownames(contact_matrix) + +# prepare initial conditions +initial_i <- 1e-6 + +initial_conditions <- c( + S = 1 - initial_i, + V1 = 0, V2 = 0, + E = 0, EV = 0, + I = initial_i, IV = 0, + H = 0, HV = 0, D = 0, R = 0 +) + +initial_conditions <- rbind( + initial_conditions, + initial_conditions, + initial_conditions +) +rownames(initial_conditions) <- rownames(contact_matrix) + +# prepare population object +uk_population <- population( + name = "UK", + contact_matrix = contact_matrix, + demography_vector = demography_vector, + initial_conditions = initial_conditions +) + +no_vaccination <- no_vaccination(population = uk_population, doses = 2) + +# run model +output <- model_vacamole_cpp( + population = uk_population, + vaccination = no_vaccination, + time_end = 300 +) +``` + +2. 
Plot the number of deaths through time + +```{r} +ggplot(output[output$compartment == "dead", ]) + + geom_line( + aes(time, value, colour = demography_group), + linewidth = 1 + ) + + scale_colour_brewer( + palette = "Dark2", + labels = rownames(contact_matrix), + name = "Age group" + ) + + scale_y_continuous( + labels = scales::comma + ) + + labs( + x = "Simulation time (days)", + y = "Individuals" + ) + + theme_bw( + base_size = 15 + ) + + theme( + legend.position = "top" + ) +``` - - +::::::::::::::::::::::::::: - +:::::::::::::::::::::::::::::::::::::::::::::::: - +## Comparing scenarios +*Coming soon* - +## Challenge : Ebola outbreak analysis + +*Coming soon* - diff --git a/episodes/model-choices.Rmd b/episodes/model-choices.Rmd index a0cf899c..f839578d 100644 --- a/episodes/model-choices.Rmd +++ b/episodes/model-choices.Rmd @@ -13,15 +13,14 @@ library(epidemics) :::::::::::::::::::::::::::::::::::::: questions -- How do I choose a model for my task? +- How do I choose a mathematical model that's appropriate to complete my analytical task? :::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::: objectives -- Learn how to access the model library in `epidemics` -- Understand the model requirements for a question +- Understand the model requirements for a specific research question :::::::::::::::::::::::::::::::::::::::::::::::: @@ -34,7 +33,7 @@ library(epidemics) ## Introduction -Using mathematical models in outbreak analysis does not necessarily require developing a new model. There are existing models for different infections, interventions and transmission patterns which can be used to answer new questions. In this tutorial, we will learn how to choose an existing model to generate predictions for a given scenario. +There are existing mathematical models for different infections, interventions and transmission patterns which can be used to answer new questions. 
In this tutorial, we will learn how to choose an existing model to complete a given task. :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor @@ -42,49 +41,77 @@ The focus of this tutorial is understanding existing models to decide if they ar :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: +::::::::::::::::::::::::::::::::::::::: discussion + ### Choosing a model -When deciding whether an existing model can be used, we must consider : +When deciding which mathematical model to use, there are a number of questions we must consider : -+ What is the infection/disease of interest? +::::::::::::::::::::::::::::::::::::::::::::::::::: + +:::::::::::::::: solution + +### What is the infection/disease of interest? A model may already exist for your study disease, or there may be a model for an infection that has the same transmission pathways and epidemiological features that can be used. -+ Do we need a [deterministic](../learners/reference.md#deterministic) or [stochastic](../learners/reference.md#stochastic) model? +::::::::::::::::::::::::: + +:::::::::::::::: solution + +### Do we need a [deterministic](../learners/reference.md#deterministic) or [stochastic](../learners/reference.md#stochastic) model? Model structures differ for whether the disease has pandemic potential or not. When predicted numbers of infection are small, stochastic variation in output can have an effect on whether an outbreak takes off or not. Outbreaks are usually smaller in magnitude than epidemics, so its often appropriate to use a stochastic model to characterise the uncertainty in the early stages of the outbreak. Epidemics are larger in magnitude than outbreaks and so a deterministic model is suitable as we have less interest in the stochastic variation in output. -+ What is the outcome of interest? +::::::::::::::::::::::::: + +:::::::::::::::: solution + +## What is the outcome of interest? 
The outcome of interest can be a feature of a mathematical model. It may be that you are interested in the predicted numbers of infection through time, or in a specific outcome such as hospitalisations or cases of severe disease. -+ Will any interventions be modelled? +::::::::::::::::::::::::: -Finally, interventions such as vaccination may be of interest. A model may or may not have the capability to include the impact of different interventions on different time scales (continuous time or at discrete time points). We will discuss interventions in detail in the next tutorial. +:::::::::::::::: solution -### Available models - -The R package `epidemics` contains functions to run existing models. -For details on the models that are available, see the package [vignettes](https://epiverse-trace.github.io/epidemics/articles). To learn how to run the models in R, read the documentation using `?epidemics::model_ebola_r`. Remember to use the questions in the '[Check model equation](#check-model-equations)' checklist to help your understanding of an existing model. +## How is transmission modelled? -::::::::::::::::::::::::::::::::::::: checklist -### Check model equations +For example, [direct](../learners/reference.md#direct) or [indirect](../learners/reference.md#indirect), [airborne](../learners/reference.md#airborne) or [vector-borne](../learners/reference.md#vectorborne). +::::::::::::::::::::::::: -- How is transmission modelled? e.g. direct or indirect, airborne or vector-borne -- What interventions are modelled? -- What state variables are there and how do they relate to assumptions about infection? -:::::::::::::::::::::::::::::::::::::::::::::::: +:::::::::::::::: solution + +## How are the different processes (e.g. transmission) formulated in the equations? + +There can be subtle differences in model structures for the same infection or outbreak type which can be missed without studying the equations. 
For example, transmissibility parameters can be specified as rates or probabilities. If you want to use parameter values from other published models, you must check that transmission is formulated in the same way. +::::::::::::::::::::::::: + +:::::::::::::::: solution + +## Will any interventions be modelled? +Finally, interventions such as vaccination may be of interest. A model may or may not have the capability to include the impact of different interventions on different time scales (continuous time or at discrete time points). We discuss interventions in detail in the tutorial [Modelling interventions](../episodes/modelling-interventions.md). +::::::::::::::::::::::::: + + + + + + +## Available models in `epidemics` + +The R package `epidemics` contains functions to run existing models. +For details on the models that are available, see the package [vignettes](https://epiverse-trace.github.io/epidemics/articles). To learn how to run the models in R, read the documentation using `?epidemics::model_ebola_r`. -## Challenge ::::::::::::::::::::::::::::::::::::: challenge ## What model? -You have been asked to explore the variation in numbers of infected individuals in the early stages of an Ebola outbreak. +You have been asked to explore the variation in numbers of infectious individuals in the early stages of an Ebola outbreak. Which of the following models would be an appropriate choice for this task: @@ -127,17 +154,20 @@ A deterministic SEIR model with age specific direct transmission. 
```{r diagram, echo = FALSE, message = FALSE} DiagrammeR::grViz("digraph { + # graph statement ################# graph [layout = dot, rankdir = LR, overlap = true, fontsize = 10] + # nodes ####### node [shape = square, - fixedsize = true, + fixedsize = true width = 1.3] + S E I @@ -145,9 +175,10 @@ DiagrammeR::grViz("digraph { # edges ####### - S -> E [label = ' infection'] - E -> I [label = ' onset of \ninfectiousness'] - I -> R [label = ' recovery'] + S -> E [label = ' infection \n(transmissibility β)'] + E -> I [label = ' onset of infectiousness \n(infectiousness rate α)'] + I -> R [label = ' recovery \n(recovery rate γ)'] + }") ``` @@ -157,7 +188,16 @@ The model is capable of predicting an Ebola type outbreak, but as the model is d #### `model_ebola_r()` -A stochastic SEIHFR (Susceptible, Exposed, Infectious, Hospitalised, Funeral, Removed) model that was developed specifically for infection with Ebola. +A stochastic SEIHFR (Susceptible, Exposed, Infectious, Hospitalised, Funeral, Removed) model that was developed specifically for infection with Ebola. The model has stochasticity in the passage times between states, which are modelled as Erlang distributions. + +The key parameters affecting the transition between states are: + ++ $R_0$, the basic reproduction number, ++ $\rho^I$, the mean infectious period, ++ $\rho^E$, the mean preinfectious period, ++ $p_{hosp}$ the probability of being transferred to the hospitalised compartment. + +**Note: the functional relationship between the preinfectious period ($\rho^E$) and the transition rate between exposed and infectious ($\gamma^E$) is $\rho^E = k^E/\gamma^E$ where $k^E$ is the shape of the Erlang distribution. Similarly for the infectious period $\rho^I = k^I/\gamma^I$. See [here](https://epiverse-trace.github.io/epidemics/articles/ebola_model.html#details-discrete-time-ebola-virus-disease-model) for more detail on the stochastic model formulation. 
** ```{r, echo = FALSE, message = FALSE} DiagrammeR::grViz("digraph { @@ -184,12 +224,12 @@ DiagrammeR::grViz("digraph { # edges ####### - S -> E [label = ' infection '] - E -> I [label = ' onset of \ninfectiousness'] - I -> F [label = ' death \n(funeral) '] - F -> R [label = ' safe burial'] - I -> H [label = ' hospitalisation'] - H -> R [label = ' recovery or \nsafe burial'] + S -> E [label = ' infection (β)'] + E -> I [label = ' onset of \ninfectiousness (γ E)'] + I -> F [label = ' death (funeral) \n(γ I)'] + F -> R [label = ' safe burial (one timestep) '] + I -> H [label = ' hospitalisation (p hosp)'] + H -> R [label = ' recovery or safe burial \n (γ I)'] subgraph { rank = same; I; F; @@ -200,7 +240,10 @@ DiagrammeR::grViz("digraph { }") ``` +The model has additional parameters describing the transmission risk in hospital and funeral settings: ++ $p_{ETU}$, the proportion of hospitalised cases contributing to the infection of susceptibles (ETU = Ebola virus treatment units), ++ $p_{funeral}$, the proportion of funerals at which the risk of transmission is the same as of infectious individuals in the community. As this model is stochastic, it is the most appropriate choice to explore how variation in numbers of infected individuals in the early stages of an Ebola outbreak. @@ -211,9 +254,162 @@ As this model is stochastic, it is the most appropriate choice to explore how va :::::::::::::::::::::::::::::::::::::::::::::::: +## Challenge : Ebola outbreak analysis + + + +::::::::::::::::::::::::::::::::::::: challenge + +## Running the model + +You have been tasked to generate initial trajectories of an Ebola outbreak in Guinea. Using `model_ebola_r()` and the information detailed below, complete the following tasks: + +1. Run the model once and plot the number of infectious individuals through time +2. 
Run model 100 times and plot the mean, upper and lower 95% quantiles of the number of infectious individuals through time + ++ Population size : 14 million ++ Initial number of exposed individuals : 10 ++ Initial number of infectious individuals : 5 ++ Time of simulation : 120 days ++ Parameter values : + + $R_0$ (`r0`) = 1.1, + + $\rho^I$ (`infectious_period`) = 12, + + $\rho^E$ (`preinfectious_period`) = 5, + + $k^E=k^I = 2$, + + $1-p_{hosp}$ (`prop_community`) = 0.9, + + $p_{ETU}$ (`etu_risk`) = 0.7, + + $p_{funeral}$ (`funeral_risk`) = 0.5 + +::::::::::::::::: hint + +### Code for initial conditions + +```{r} +# set population size +population_size <- 14e6 + +E0 <- 10 +I0 <- 5 +# prepare initial conditions as proportions +initial_conditions <- c( + S = population_size - (E0 + I0), E = E0, I = I0, H = 0, F = 0, R = 0 +) / population_size + +guinea_population <- population( + name = "Guinea", + contact_matrix = matrix(1), # note dummy value + demography_vector = population_size, # 14 million, no age groups + initial_conditions = matrix( + initial_conditions, + nrow = 1 + ) +) +``` + + +:::::::::::::::::::::: + + +::::::::::::::::: hint + +### HINT : Multiple model simulations + +Adapt the code from the [accounting for uncertainty](../episodes/simulating-transmission.md#accounting-for-uncertainty) section + +:::::::::::::::::::::: + +::::::::::::::::: solution + +### SOLUTION + +1. 
Run the model once and plot the number of infectious individuals through time + + +```{r} +output <- model_ebola_r( + population = guinea_population, + transmissibility = 1.1 / 12, + infectiousness_rate = 2.0 / 5, + removal_rate = 2.0 / 12, + prop_community = 0.9, + etu_risk = 0.7, + funeral_risk = 0.5, + time_end = 100 +) + +ggplot(output[output$compartment == "infectious", ]) + + geom_line( + aes(time, value), + linewidth = 1.2 + ) + + scale_y_continuous( + labels = scales::comma + ) + + labs( + x = "Simulation time (days)", + y = "Individuals" + ) + + theme_bw( + base_size = 15 + ) +``` + +2. Run model 100 times and plot the mean, upper and lower 95% quantiles of the number of infectious individuals through time + +We run the model 100 times with the *same* parameter values. + +```{r} +output_samples <- Map( + x = seq(100), + f = function(x) { + output <- model_ebola_r( + population = guinea_population, + transmissibility = 1.1 / 12, + infectiousness_rate = 2.0 / 5, + removal_rate = 2.0 / 12, + prop_community = 0.9, + etu_risk = 0.7, + funeral_risk = 0.5, + time_end = 100 + ) + # add replicate number and return data + output$replicate <- x + output + } +) + +output_samples <- bind_rows(output_samples) # requires the tidyverse package + +ggplot(output_samples[output_samples$compartment == "infectious", ], aes(time, value)) + + stat_summary(geom = "line", fun = mean) + + stat_summary( + geom = "ribbon", + fun.min = function(z) { + quantile(z, 0.025) + }, + fun.max = function(z) { + quantile(z, 0.975) + }, + alpha = 0.3 + ) + + labs( + x = "Simulation time (days)", + y = "Individuals" + ) + + theme_bw( + base_size = 15 + ) +``` + +::::::::::::::::::::::::::: + + +:::::::::::::::::::::::::::::::::::::::::::::::: + + ::::::::::::::::::::::::::::::::::::: keypoints -- Existing models can be used for new questions -- Check that a model has appropriate assumptions about transmission, outbreak potential, outcomes and interventions +- Existing mathematical models should 
be selected according to the research question +- It is important to check that a model has appropriate assumptions about transmission, outbreak potential, outcomes and interventions :::::::::::::::::::::::::::::::::::::::::::::::: diff --git a/episodes/modelling-interventions.Rmd b/episodes/modelling-interventions.Rmd index ef44582c..cf3878f7 100644 --- a/episodes/modelling-interventions.Rmd +++ b/episodes/modelling-interventions.Rmd @@ -19,7 +19,7 @@ library(epidemics) ::::::::::::::::::::::::::::::::::::: objectives -- Learn how to implement pharmaceutical and non-pharmaceutical interventions +- Add pharmaceutical and non-pharmaceutical interventions to an {epidemics} model :::::::::::::::::::::::::::::::::::::::::::::::: @@ -28,7 +28,7 @@ library(epidemics) ## Prerequisites + Complete tutorial [Simulating transmission](../episodes/simulating-transmission.md) -This tutorial has the following concept dependencies: +Learners should familiarise themselves with the following concept dependencies before working through this tutorial: **Outbreak response** : [Intervention types](https://www.cdc.gov/nonpharmaceutical-interventions/). ::::::::::::::::::::::::::::::::: @@ -36,7 +36,9 @@ This tutorial has the following concept dependencies: ## Introduction -Mathematical models can be used to generate predictions for the implementation of non-pharmaceutical and pharmaceutical interventions at different stages of an outbreak. In this tutorial, we will introduce how to include different interventions in models. +Mathematical models can be used to generate trajectories of disease spread under the implementation of interventions at different stages of an outbreak. These predictions can be used to make decisions on what interventions could be implemented to slow down the spread of diseases. + +We can assume interventions in mathematical models reduce the values of relevant parameters e.g. reduce transmissibility while in place. 
Or it may be appropriate to assume individuals are classified into a new disease state, e.g. once vaccinated we assume individuals are no longer susceptible to infection and therefore move to a vaccinated state. In this tutorial, we will introduce how to include three different interventions in a model of COVID-19 transmission. :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor @@ -46,13 +48,11 @@ In this tutorial different types of intervention and how they can be modelled ar ## Non-pharmaceutical interventions -Non-pharmaceutical interventions (NPIs) are measures put in place to reduce transmission that do not include taking medicine or vaccines. NPIs aim reduce contact between infectious and susceptible individuals. For example, washing hands, wearing masks and closures of school and workplaces. - -In mathematical modelling, we must make assumptions about how NPIs will affect transmission. This may include adding additional disease states or reducing the value of relevant parameters. +[Non-pharmaceutical interventions](../learners/reference.md#NPIs) (NPIs) are measures put in place to reduce transmission that do not include the administration of drugs or vaccinations. NPIs aim to reduce contact between infectious and susceptible individuals. For example, washing hands, wearing masks and closures of schools and workplaces. -#### Effect of school closures on COVID-19 spread +We will investigate the effect of interventions on a COVID-19 outbreak using an SEIR model (`model_default_cpp()` in the R package `{epidemics}`). We will set $R_0 = 2.7$, latent period or preinfectious period $= 4$ and the infectious_period $= 5.5$ (parameters adapted from [Davies et al. (2020)](https://doi.org/10.1016/S2468-2667(20)30133-X)). We load a contact matrix with age bins 0-18, 18-65, 65 years and older using `{socialmixr}` and assume that one in every 1 million in each age group is infectious at the start of the epidemic. 
-```{r model_setup, echo = FALSE, message = FALSE} +```{r model_setup, echo = TRUE, message = FALSE} polymod <- socialmixr::polymod contact_data <- socialmixr::contact_matrix( polymod, @@ -88,27 +88,34 @@ uk_population <- population( demography_vector = demography_vector, initial_conditions = initial_conditions ) - -# simulate a pandemic, with an R0, -# an infectious period, and an pre-infectious period -covid <- infection( - r0 = 2.7, - preinfectious_period = 4, - infectious_period = 5.5 -) ``` +#### Effect of school closures on COVID-19 spread -We want to investigate the effect of school closures on reducing the number of individuals infectious with COVID-19 through time. We assume that a school closure will reduce the frequency of contacts within and between different age groups. +The first NPI we will consider is the effect of school closures on reducing the number of individuals infectious with COVID-19 through time. We assume that a school closure will reduce the frequency of contacts within and between different age groups. We assume that school closures will reduce the contacts between school aged children (aged 0-15) by 0.5, and will cause a small reduction (0.01) in the contacts between adults (aged 15 and over). -Using an SEIR model (`model_default_cpp()` in the R package `{epidemics}`) we set $R_0 = 2.7$, preinfectious period $= 4$ and the infectious_period $= 5.5$ (parameters adapted from [Davies et al. (2020)](https://doi.org/10.1016/S2468-2667(20)30133-X)). We load a contact matrix with age bins 0-18, 18-65, 65 years and older using `{socialmixr}` and assume that one in every 1 million in each age group is infectious at the start of the epidemic. +To include an intervention in our model we must create an `intervention` object. The inputs are the name of the intervention (`name`), the type of intervention (`contacts` or `rate`), the start time (`time_begin`), the end time (`time_end`) and the reduction (`reduction`). 
The values of the reduction matrix are specified in the same order as the age groups in the contact matrix. -We will assume that school closures will reduce the contacts between school aged children (aged 0-15) by 0.5, and will cause a small reduction (0.01) in the contacts between adults (aged 15 and over). +```{r} +rownames(contact_matrix) +``` + +Therefore, we specify ` reduction = matrix(c(0.5, 0.01, 0.01))`. We assume that the school closures start on day 50 and are in place for a further 100 days. Therefore our intervention object is : + +```{r intervention} +close_schools <- intervention( + name = "School closure", + type = "contacts", + time_begin = 50, + time_end = 50 + 100, + reduction = matrix(c(0.5, 0.01, 0.01)) +) +``` ::::::::::::::::::::::::::::::::::::: callout ### Effect of interventions on contacts -The contact matrix is scaled down by proportions for the period in which the intervention is in place. To explain the reduction, consider a contact matrix for two age groups with equal number of contacts: +In `epidemics`, the contact matrix is scaled down by proportions for the period in which the intervention is in place. To understand how the reduction is calculated within the model functions, consider a contact matrix for two age groups with equal number of contacts: ```{r echo = FALSE} reduction <- matrix(c(0.5, 0.1)) @@ -130,77 +137,57 @@ The contacts within group 1 are reduced by 50% twice to accommodate for a 50% re :::::::::::::::::::::::::::::::::::::::::::::::: -To include an intervention in our model we must create an `intervention` object. The inputs are the name of the intervention (`name`), the type of intervention (`contacts` or `rate`), the start time (`time_begin`), the end time (`time_end`) and the reduction (`reduction`). The values of the reduction matrix are specified in the same order as the age groups in the contact matrix. 
- -```{r} -rownames(contact_matrix) -``` - -Therefore, we specify ` reduction = matrix(c(0.5, 0.01, 0.01))`. We assume that the school closures start on day 50 and are in place for a further 100 days. Therefore our intervention object is : +Using transmissibility $= 2.7/5.5$ (remember that [transmissibility = $R_0$/ infectious period](../episodes/simulating-transmission.md#the-basic-reproduction-number-r_0)), infectiousness rate $= 1/4$ and the recovery rate $= 1/5.5$ we run the model with `intervention = list(contacts = close_schools)` as follows : -```{r intervention} -close_schools <- intervention( - name = "School closure", - type = "contacts", - time_begin = 50, - time_end = 50 + 100, - reduction = matrix(c(0.5, 0.01, 0.01)) -) -``` - -```{r baseline, echo = FALSE} -output <- model_default_cpp( +```{r school} +output_school <- model_default_cpp( population = uk_population, - infection = covid, + transmissibility = 2.7 / 5.5, + infectiousness_rate = 1.0 / 4.0, + recovery_rate = 1.0 / 5.5, + intervention = list(contacts = close_schools), time_end = 300, increment = 1.0 ) ``` -To run the model with an intervention we set ` intervention = list(contacts = close_schools)` as follows: +To be able to see the effect of our intervention, we also run the model where there is no intervention, combine the two outputs into one data frame and then plot the output. 
Here we plot the total number of infectious individuals in all age groups using `ggplot2::stat_summary()`: -```{r school} -output_school <- model_default_cpp( +```{r baseline, echo = TRUE, fig.width = 10} +# run baseline simulation with no intervention +output_baseline <- model_default_cpp( population = uk_population, - infection = covid, - intervention = list(contacts = close_schools), + transmissibility = 2.7 / 5.5, + infectiousness_rate = 1.0 / 4.0, + recovery_rate = 1.0 / 5.5, time_end = 300, increment = 1.0 ) -``` - -We see that with the intervention (solid line) in place, the infection still spreads through the population, though the epidemic peak is smaller than the baseline with no intervention in place (dashed line). - -```{r plot_school, echo = FALSE, message = FALSE, fig.width = 10} -ggplot() + - aes(x = time, y = value) + - stat_summary( - data = output_school[compartment == "infectious", ], - fun = sum, - color = "black", - geom = "line", - linewidth = 1 +# create intervention_type column for plotting +output_school$intervention_type <- "school closure" +output_baseline$intervention_type <- "baseline" +output <- rbind(output_school, output_baseline) + +ggplot(data = output[output$compartment == "infectious", ]) + + aes( + x = time, + y = value, + color = intervention_type, + linetype = intervention_type ) + stat_summary( - data = output[compartment == "infectious", ], - fun = sum, - color = "black", + fun = "sum", geom = "line", - linewidth = 1, - linetype = "dashed" + linewidth = 1 ) + scale_y_continuous( - labels = scales::comma, - name = "Infectious indivduals" + labels = scales::comma ) + labs( - x = "Model time (days)" + x = "Simulation time (days)", + y = "Individuals" ) + - theme_classic() + - theme( - legend.position = "top" - ) + - theme_grey( + theme_bw( base_size = 15 ) + geom_vline( @@ -220,12 +207,15 @@ ggplot() + vjust = "outward" ) ``` +We see that with the intervention in place, the infection still spreads through the population, though 
the peak number of infectious individuals is smaller than the baseline with no intervention in place (solid line). + + #### Effect of mask wearing on COVID-19 spread We can model the effect of other NPIs as reducing the value of relevant parameters. For example, we want to investigate the effect of mask wearing on the number of individuals infectious with COVID-19 through time. -We expect that mask wearing will reduce an individual's infectiousness. As we are using a population based model, we cannot make changes to individual behaviour and so assume that the transmission rate $\beta$ is reduced by a proportion due to mask wearing in the population. We specify this proportion, $\theta$ as product of the proportion wearing masks multiplied by the proportion reduction in transmissibility (adapted from [Li et al. 2020](https://doi.org/10.1371/journal.pone.0237691)) +We expect that mask wearing will reduce an individual's infectiousness. As we are using a population based model, we cannot make changes to individual behaviour and so assume that the transmissibility $\beta$ is reduced by a proportion due to mask wearing in the population. We specify this proportion, $\theta$, as the product of the proportion wearing masks and the proportion reduction in transmissibility (adapted from [Li et al. 2020](https://doi.org/10.1371/journal.pone.0237691)). We create an intervention object with `type = "rate"` and `reduction = 0.163`. Using parameters adapted from [Li et al. 2020](https://doi.org/10.1371/journal.pone.0237691) we have proportion wearing masks = coverage $\times$ availability = $0.54 \times 0.525 = 0.2835$, proportion reduction in transmissibility = $0.575$. Therefore, $\theta = 0.2835 \times 0.575 = 0.163$. We assume that the mask wearing mandate starts at day 40 and is in place for 200 days. 
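As a quick check of the arithmetic above, the following sketch recomputes $\theta$ in base R. The variable names (`mask_coverage`, `mask_availability`, `mask_efficacy`) are illustrative assumptions and are not arguments of `{epidemics}`; the values are those quoted above from Li et al. 2020.

```r
# illustrative variable names; values as quoted in the text above
mask_coverage <- 0.54
mask_availability <- 0.525
mask_efficacy <- 0.575 # proportion reduction in transmissibility

# proportion of the population wearing masks
proportion_wearing_masks <- mask_coverage * mask_availability

# proportional reduction applied to the transmissibility
theta <- proportion_wearing_masks * mask_efficacy
round(theta, 3)
```

The rounded value is the one that would be supplied as the `reduction` of the rate intervention.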
@@ -244,44 +234,41 @@ To implement this intervention on the parameter $\beta$, we specify `interventio ```{r output_masks} output_masks <- model_default_cpp( population = uk_population, - infection = covid, - intervention = list(beta = mask_mandate), + transmissibility = 2.7 / 5.5, + infectiousness_rate = 1.0 / 4.0, + recovery_rate = 1.0 / 5.5, + intervention = list(transmissibility = mask_mandate), time_end = 300, increment = 1.0 ) ``` +```{r plot_masks, echo = TRUE, message = FALSE, fig.width = 10} +# create intervention_type column for plotting +output_masks$intervention_type <- "mask mandate" +output_baseline$intervention_type <- "baseline" +output <- rbind(output_masks, output_baseline) -```{r plot_masks, echo = FALSE, message = FALSE, fig.width = 10} -ggplot() + - aes(x = time, y = value) + - stat_summary( - data = output_masks[compartment == "infectious", ], - fun = sum, - color = "black", - geom = "line", - linewidth = 1 +ggplot(data = output[output$compartment == "infectious", ]) + + aes( + x = time, + y = value, + color = intervention_type, + linetype = intervention_type ) + stat_summary( - data = output[compartment == "infectious", ], - fun = sum, - color = "black", + fun = "sum", geom = "line", - linewidth = 1, - linetype = "dashed" + linewidth = 1 ) + scale_y_continuous( - labels = scales::comma, - name = "Infectious indivduals" + labels = scales::comma ) + labs( - x = "Model time (days)" - ) + - theme_classic() + - theme( - legend.position = "top" + x = "Simulation time (days)", + y = "Individuals" ) + - theme_grey( + theme_bw( base_size = 15 ) + geom_vline( @@ -302,10 +289,21 @@ ggplot() + ) ``` +::::::::::::::::::::::::::::::::::::: callout +### Intervention types + +There are two intervention types for `model_default_cpp()`: rate interventions on model parameters (`transmissibility` $\beta$, `infectiousness_rate` $\sigma$ and `recovery_rate` $\gamma$) and contact matrix reductions (`contacts`). 
+ +To implement both contact and rate interventions in the same simulation they must be passed as a list e.g. `intervention = list(transmissibility = mask_mandate, contacts = close_schools)`. But if there are multiple interventions that target contact rates, these must be passed as one `contacts` input. See the [vignette on modelling overlapping interventions](https://epiverse-trace.github.io/epidemics/articles/multiple_interventions.html) for more detail. + +:::::::::::::::::::::::::::::::::::::::::::::::: + ## Pharmaceutical interventions -Models can be used to investigate the effect of pharmaceutical interventions, such as vaccination. In this case, it is useful to add another disease state to track the number of vaccinated individuals through time. The diagram below shows an SEIRV model where susceptible individuals are vaccinated and then move to the $V$ class. +Pharmaceutical interventions (PIs) are measures such as vaccination and mass treatment programs. In the previous section, we assumed that interventions reduced the value of parameter values while the intervention was in place. In the case of vaccination, we assume that after the intervention individuals are no longer susceptible and should be classified into a different disease state. Therefore, we specify the rate at which individuals are vaccinated and track the number of vaccinated individuals through time. + +The diagram below shows the SEIRV model implemented using `model_default_cpp()` where susceptible individuals are vaccinated and then move to the $V$ class. ```{r diagram_SEIRV, echo = FALSE, message = FALSE} DiagrammeR::grViz("digraph { @@ -339,6 +337,8 @@ DiagrammeR::grViz("digraph { }") ``` + + The equations describing this model are as follows: $$ @@ -350,10 +350,11 @@ $$ \frac{dV_i}{dt} & =\nu_{i,t} S_i\\ \end{aligned} $$ -Individuals are vaccinated at an age group ($i$) specific time dependent ($t$) vaccination rate ($\nu$). 
The SEIR components of these equations are described in the tutorial Simulating transmission. +Individuals are vaccinated at an age group ($i$) specific time dependent ($t$) vaccination rate ($\nu_{i,t}$). The SEIR components of these equations are described in the tutorial [simulating transmission](../episodes/simulating-transmission.md#simulating-disease-spread). -To explore the effect of vaccination we need to create a vaccination object. As vaccination is age group specific, we must pass an age groups specific vaccination rate $\nu$ and age group specific start and end times of the vaccination program. Here we will assume all age groups are vaccinated at the same rate and that the vaccination program starts on day 40 and is in place for 150 days. +To explore the effect of vaccination we need to create a vaccination object to pass as an input into `model_default_cpp()` that includes an age groups specific vaccination rate `nu` and age group specific start and end times of the vaccination program (`time_begin` and `time_end`). +Here we will assume all age groups are vaccinated at the same rate 0.01 and that the vaccination program starts on day 40 and is in place for 150 days. ```{r vaccinate} # prepare a vaccination object @@ -370,71 +371,72 @@ We pass our vaccination object using `vaccination = vaccinate`: ```{r output_vaccinate} output_vaccinate <- model_default_cpp( population = uk_population, - infection = covid, + transmissibility = 2.7 / 5.5, + infectiousness_rate = 1.0 / 4.0, + recovery_rate = 1.0 / 5.5, vaccination = vaccinate, time_end = 300, increment = 1.0 ) ``` -Here we see that the total number of infectious individuals when vaccination is in place is much lower compared to school closures and mask wearing interventions. 
-```{r plot_vaccinate, echo = FALSE, message = FALSE, fig.width = 10} -ggplot() + - aes(x = time, y = value) + - stat_summary( - data = output_vaccinate[compartment == "infectious", ], - fun = sum, - color = "black", - geom = "line", - linewidth = 1 +::::::::::::::::::::::::::::::::::::: challenge + +## Compare interventions + +Plot the three interventions vaccination, school closure and mask mandate and the baseline simulation on one plot. Which intervention reduces the peak number of infectious individuals the most? + + +:::::::::::::::::::::::: solution + +## Output + +```{r plot_vaccinate, echo = TRUE, message = FALSE, fig.width = 10} +# create intervention_type column for plotting +output_vaccinate$intervention_type <- "vaccination" +output <- rbind(output_school, output_masks, output_vaccinate, output_baseline) + +ggplot(data = output[output$compartment == "infectious", ]) + + aes( + x = time, + y = value, + color = intervention_type, + linetype = intervention_type ) + stat_summary( - data = output[compartment == "infectious", ], - fun = sum, - color = "black", + fun = "sum", geom = "line", - linewidth = 1, - linetype = "dashed" + linewidth = 1 ) + scale_y_continuous( - labels = scales::comma, - name = "Infectious indivduals" + labels = scales::comma ) + labs( - x = "Model time (days)" - ) + - theme_classic() + - theme( - legend.position = "top" + x = "Simulation time (days)", + y = "Individuals" ) + - theme_grey( + theme_bw( base_size = 15 - ) + - geom_vline( - xintercept = c(vaccinate$time_begin, vaccinate$time_end), - colour = "black", - linetype = "dashed", - linewidth = 0.2 - ) + - annotate( - geom = "text", - label = "Vaccination", - colour = "black", - x = (vaccinate$time_end - vaccinate$time_begin) / 2 + vaccinate$time_begin, - y = 10, - angle = 0, - vjust = "outward" ) ``` +From the plot we see that the peak number of total number of infectious individuals when vaccination is in place is much lower compared to school closures and mask wearing 
interventions. + +::::::::::::::::::::::::::::::::: +:::::::::::::::::::::::::::::::::::::::::::::::: + + + ## Summary -Modelling interventions requires assumptions of how interventions affect model parameters such as contact matrices or parameter values. Next we want quantify the effect of an interventions. In the next tutorial, we will learn how to compare intervention scenarios against each other. +Different types of intervention can be implemented using mathematical modelling. Modelling interventions requires assumptions about which model parameters are affected (e.g. contact matrices, transmissibility), by what magnitude, and at what times in the simulation of an outbreak. +The next step is to quantify the effect of an intervention. If you are interested in learning how to compare interventions, please complete the tutorial [Comparing public health outcomes of interventions](../episodes/compare-interventions.md). ::::::::::::::::::::::::::::::::::::: keypoints -- Different types of intervention can be implemented using mathematical modelling +- The effect of NPIs can be modelled as reducing contact rates between age groups or reducing the transmissibility of infection +- Vaccination can be modelled by assuming individuals move to a different disease state $V$ :::::::::::::::::::::::::::::::::::::::::::::::: diff --git a/episodes/simulating-transmission.Rmd b/episodes/simulating-transmission.Rmd index 0d656a6f..b88352a7 100644 --- a/episodes/simulating-transmission.Rmd +++ b/episodes/simulating-transmission.Rmd @@ -6,6 +6,7 @@ exercises: 30 # exercise time in minutes ```{r setup, echo= FALSE, message = FALSE, warning = FALSE} require(ggplot2) +require(dplyr) require(testthat) require(tidyverse) require(DiagrammeR) @@ -17,18 +18,18 @@ webshot::install_phantomjs(force = TRUE) :::::::::::::::::::::::::::::::::::::: questions -- How do I generate predictions of disease trajectories? +- How do I simulate disease spread using a mathematical model? 
- What inputs are needed for a model simulation? +- How do I account for uncertainty? :::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::: objectives -Using the R package `epidemics`, learn how to: - -- load an existing model structure, -- load an existing social contact matrix, -- run a model simulation. +- Load an existing model structure from `{epidemics}` R package +- Load an existing social contact matrix with `{socialmixr}` +- Generate a disease spread model simulation with `{epidemics}` +- Generate multiple model simulations and visualise uncertainty :::::::::::::::::::::::::::::::::::::::::::::::: @@ -36,9 +37,9 @@ Using the R package `epidemics`, learn how to: ## Prerequisites -This tutorial has the following concept dependencies: +Learners should familiarise themselves with following concept dependencies before working through this tutorial: -**Modelling** : [Components of infectious disease models](https://doi.org/10.1038/s41592-020-0856-2) e.g. state variables, parameters, initial conditions, and ordinary differential equations. +**Mathematical Modelling** : [Introduction to infectious disease models](https://doi.org/10.1038/s41592-020-0856-2), [state variables](../learners/reference.md#state), [model parameters](../learners/reference.md#parsode), [initial conditions](../learners/reference.md#initial), [ordinary differential equations](../learners/reference.md#ordinary). **Epidemic theory** : [Transmission](https://doi.org/10.1155/2011/267049), [Reproduction number](https://doi.org/10.3201/eid2501.171901). ::::::::::::::::::::::::::::::::: @@ -47,11 +48,9 @@ This tutorial has the following concept dependencies: ## Introduction -Mathematical models are useful tools for generating future trajectories of disease spread. Models can be used to evaluate the implementation of non-pharmaceutical and pharmaceutical interventions while accounting for factors such as age. 
- -In this tutorial, we will use the R package `{epidemics}` to generate trajectories of influenza spread. By the end of this tutorial, you will be able to generate the trajectory below showing the number of infectious individuals in different age categories over time. +Mathematical models are useful tools for generating future trajectories of disease spread. In this tutorial, we will use the R package `{epidemics}` to generate disease trajectories of an influenza strain with pandemic potential. By the end of this tutorial, you will be able to generate the trajectory below showing the number of infectious individuals in different age categories over time. -```{r traj, echo = FALSE, message= FALSE, fig.width = 10} +```{r traj, echo = FALSE, message = FALSE, fig.width = 10, eval = TRUE} # load contact and population data from socialmixr::polymod polymod <- socialmixr::polymod contact_data <- socialmixr::contact_matrix( @@ -94,45 +93,50 @@ uk_population <- population( initial_conditions = initial_conditions ) -# simulate a pandemic, with an R0, -# an infectious period, and an pre-infectious period -influenza <- infection( - name = "influenza", - r0 = 1.46, - preinfectious_period = 3, - infectious_period = 7 -) - # run an epidemic model using `epidemic()` -output <- model_default_cpp( +output_plot <- model_default_cpp( population = uk_population, - infection = influenza, + transmissibility = 1.46 / 7.0, + infectiousness_rate = 1.0 / 3.0, + recovery_rate = 1.0 / 7.0, time_end = 600, increment = 1.0 ) -ggplot(output[compartment == "infectious", ]) + +filter(output_plot, compartment %in% c("exposed", "infectious")) %>% + ggplot( + aes( + x = time, + y = value, + col = demography_group, + linetype = compartment + ) + ) + geom_line( - aes(time, value, colour = demography_group), - linewidth = 1 + linewidth = 1.2 + ) + + scale_y_continuous( + labels = scales::comma ) + scale_colour_brewer( palette = "Dark2", - labels = rownames(contact_matrix), name = "Age group" ) + - 
scale_y_continuous( - labels = scales::comma, - name = "Infectious indivduals" + expand_limits( + y = c(0, 500e3) ) + - labs( - x = "Model time (days)" + coord_cartesian( + expand = FALSE + ) + + theme_bw( + base_size = 15 ) + - theme_classic() + theme( legend.position = "top" ) + - theme_grey( - base_size = 15 + labs( + x = "Simulation time (days)", + linetype = "Compartment", + y = "Individuals" ) ``` @@ -143,36 +147,13 @@ By the end of this tutorial, learners should be able to replicate the above imag :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -The first step is to install the R packages `epidemics`. -*Note : this tutorial is based on a development version of {epidemics}. This version of the package can be installed using `{pak}`:* +## Simulating disease spread -```{r installation, eval = FALSE} -if (!require("pak")) install.packages("pak") -pak::pak("epiverse-trace/epidemics@96a7b1457") -``` - - -## Model structures To generate predictions of infectious disease trajectories, we must first select a mathematical model to use. -There is a library of models to choose from in `epidemics`. Models in `epidemics` are prefixed with `model` and suffixed by the name of infection (e.g. ebola) or a different identifier (e.g. default), and whether the model has a R or C++ code base. In this tutorial, we will use the default epidemic model, `model_default_cpp()` which is described in the next section. +There is a library of models to choose from in `epidemics`. Models in `epidemics` are prefixed with `model` and suffixed by the name of infection (e.g. Ebola) or a different identifier (e.g. default), and whether the model has a R or [C++](../learners/reference.md#cplusplus) code base. - -::::::::::::::::::::::::::::::::::::: callout -### Check model equations -When using existing model structures always check the model assumptions. Ask questions such as: - -- How is transmission modelled? e.g. 
[direct](../learners/reference.md#direct) or [indirect](../learners/reference.md#indirect), [airborne](../learners/reference.md#airborne) or [vector-borne](../learners/reference.md#vectorborne)? -- What interventions are modelled? -- What state variables are there and how do they relate to assumptions about infection? - -There can be subtle differences in model structures for the same infection or outbreak type which can be missed without studying the equations. -:::::::::::::::::::::::::::::::::::::::::::::::: - - -### An epidemic model for pandemic influenza - -We want to generate disease trajectories of an influenza strain with pandemic potential. We will use the default epidemic model in `epidemics`, an age-structured SEIR model described by a system of ordinary differential equations. For each age group $i$, individuals are classed as either susceptible $S$, infected but not yet infectious $E$, infectious $I$ or recovered $R$. +In this tutorial, we will use the default model in `epidemics`, `model_default_cpp()` which is an age-structured SEIR model described by a system of [ordinary differential equations](../learners/reference.md#ordinary). For each age group $i$, individuals are classed as either susceptible $S$, infected but not yet infectious $E$, infectious $I$ or recovered $R$. The schematic below shows the processes which describe the flow of individuals between the disease states $S$, $E$, $I$ and $R$ and the key parameters for each process. 
```{r diagram, echo = FALSE, message = FALSE} DiagrammeR::grViz("digraph { @@ -197,23 +178,27 @@ DiagrammeR::grViz("digraph { # edges ####### - S -> E [label = ' infection (β)'] - E -> I [label = ' onset of \ninfectiousness (α)'] - I -> R [label = ' recovery (γ)'] + S -> E [label = ' infection \n(transmissibility β)'] + E -> I [label = ' onset of infectiousness \n(infectiousness rate α)'] + I -> R [label = ' recovery \n(recovery rate γ)'] }") ``` +::::::::::::::::::::::::::::::::::::: callout +### Model parameters : rates -The model parameters and equations are as follows : +In ODE models, model parameters are often (but not always) specified as rates. The rate at which an event occurs is the inverse of the average time until that event. For example, in the SEIR model, the recovery rate $\gamma$ is the inverse of the average infectious period. + +We can use knowledge of the natural history of the disease to inform our values of rates. If the average infectious period of an infection is 8 days, then the daily recovery rate is $\gamma = 1/8 = 0.125$. -- transmission rate $\beta$, -- contact matrix $C$ containing the frequency of contacts between age groups (a square $i \times j$ matrix), -- rate of transition from exposed to infectious $\alpha$ (preinfectious period=$1/\alpha$), -- recovery rate $\gamma$ (infectious period = $1/\gamma$). + +:::::::::::::::::::::::::::::::::::::::::::::::: +For each disease state ($S$, $E$, $I$ and $R$) and age group ($i$), we have an ordinary differential equation describing the rate of change with respect to time. + $$ \begin{aligned} \frac{dS_i}{dt} & = - \beta S_i \sum_j C_{i,j} I_j \\ @@ -222,15 +207,15 @@ $$ \frac{dR_i}{dt} &=\gamma I_i \\ \end{aligned} $$ +Individuals in age group ($i$) move from the susceptible state ($S_i$) to the exposed state ($E_i$) via age group specific contact with the infectious individuals in their own and other age groups $\beta S_i \sum_j C_{i,j} I_j$. 
The contact matrix $C$ allows for heterogeneity in contacts between age groups. They then move to the infectious state at a rate $\alpha$ and recover at a rate $\gamma$. There is no loss of immunity (there are no flows out of the recovered state). -The *contact matrix* is a square matrix consisting of rows/columns equal to the number age groups. Each element represents the frequency of contacts between age groups. If we believe that transmission of an infection is driven by contact, and that contact rates are very different for different age groups, then specifying a contact matrix allows us to account for age specific rates of transmission. - -From the model structure we see that : +The model parameters definitions are : -- the contact matrix $C$ allows for heterogeneity in contacts between age groups, -- there is no loss of immunity (there are no flows out of the recovered state). +- transmission rate or transmissibility $\beta$, +- [contact matrix](../learners/reference.md#contact) $C$ containing the frequency of contacts between age groups (a square $i \times j$ matrix), +- infectiousness rate $\alpha$ (preinfectious period ([latent period](../learners/reference.md#latent)) =$1/\alpha$), +- recovery rate $\gamma$ (infectious period = $1/\gamma$). -This model also has the functionality to include vaccination and tracks the number of vaccinated individuals through time. We will cover the use of interventions in future tutorials. ::::::::::::::::::::::::::::::::::::: callout ### Exposed, infected, infectious @@ -244,54 +229,14 @@ We will use the following definitions for our state variables: :::::::::::::::::::::::::::::::::::::::::::::::: -To generate trajectories using our model, we need the following : - -1. parameter values, -2. contact matrix, -3. demographic structure, -4. initial conditions. 
- -## Model parameters - -To run our model we need to specify the model parameters: - -- transmission rate $\beta$, -- rate of transition from exposed to infectious $\alpha$ (preinfectious period=$1/\alpha$), -- recovery rate $\gamma$ (infectious period=$1/\gamma$). - -We will learn how to specify the contact matrix $C$ in the next section. - -We will simulate a strain of influenza with pandemic potential with $R_0=1.5$, a preinfectious period of 3 days and infectious period of 7 days. - -In `epidemics`, we use the function `infection()` to create an infection object containing the values of, $R_0$, the preinfectious period ($1/\alpha$) and the infectious period ($1/\gamma$) as follows. - -```{r, eval = FALSE} -influenza <- infection( - name = "influenza", - r0 = 1.5, - preinfectious_period = 3, - infectious_period = 7 -) -``` - -::::::::::::::::::::::::::::::::::::: callout -### The basic reproduction number $R_0$ -The basic reproduction number, $R_0$, for the SEIR model is: - -$$ R_0 = \frac{\beta}{\gamma}.$$ - -Therefore, we can rewrite the transmission rate, $\beta$, as: - -$$ \beta = R_0 \gamma.$$ - - -:::::::::::::::::::::::::::::::::::::::::::::::: - - - +To generate trajectories using our model, we must prepare the following inputs : +1. Contact matrix +2. Initial conditions +3. Population structure +4. Model parameters -### Contact matrix +### 1. Contact matrix Contact matrices can be estimated from surveys or contact data, or synthetic ones can be used. We will use the R package `{socialmixr}` to load in a contact matrix estimated from POLYMOD survey data [(Mossong et al. 2008)](https://doi.org/10.1371/journal.pmed.0050074). @@ -340,7 +285,7 @@ contact_matrix ::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::::::::::: -The result is a square matrix with rows and columns for each age group. Contact matrices can be loaded from other sources, but they must be in the correct format to be used in `epidemics`. 
+The result is a square matrix with rows and columns for each age group. Contact matrices can be loaded from other sources, but they must be formatted as a matrix to be used in `epidemics`. ::::::::::::::::::::::::::::::::::::: callout ### Why would a contact matrix be non-symmetric? @@ -349,11 +294,7 @@ One of the arguments of the function `contact_matrix()` is `symmetric=TRUE`. Thi :::::::::::::::::::::::::::::::::::::::::::::::: -## Generating trajectories - -We have prepared our parameter values, contact matrix and demography vector. Now we must set the initial conditions, prepare the population and run the model. - -### Initial conditions +### 2. Initial conditions The initial conditions are the proportion of individuals in each disease state $S$, $E$, $I$ and $R$ for each age group at time 0. In this example, we have three age groups age between 0 and 20 years, age between 20 and 40 years and over. Let's assume that in the youngest age category, one in a million individuals are infectious, and the remaining age categories are infection free. @@ -377,25 +318,23 @@ initial_conditions_free <- c( We combine the three initial conditions vectors into one matrix, ```{r initial condtions} -# build for all age groups +# combine the initial conditions initial_conditions <- rbind( - initial_conditions_inf, - initial_conditions_free, - initial_conditions_free + initial_conditions_inf, # age group 1 + initial_conditions_free, # age group 2 + initial_conditions_free # age group 3 ) + +# use contact matrix to assign age group names rownames(initial_conditions) <- rownames(contact_matrix) initial_conditions ``` -### Running the model -To run the model we need the following inputs: -- an infection object, -- a population object, -- an optional number of time steps. -We have already created our infection object `influenza`. The population object requires a vector containing the demographic structure of the population. 
The demographic vector must be a named vector containing the number of individuals in each age group of our given population. In this example, we can extract the demographic information from the `contact_data` object that we obtained using the `socialmixr` package. +### 3. Population structure +The population object requires a vector containing the demographic structure of the population. The demographic vector must be a named vector containing the number of individuals in each age group of our given population. In this example, we can extract the demographic information from the `contact_data` object that we obtained using the `socialmixr` package. ```{r demography} demography_vector <- contact_data$demography$population @@ -414,42 +353,113 @@ uk_population <- population( ) ``` -No we are ready to run our model. We will specify `time_end=600` to run the model for 600 days. +### 4. Model parameters + +To run our model we need to specify the model parameters: + +- transmissibility $\beta$, +- infectiousness rate $\alpha$ (preinfectious period=$1/\alpha$), +- recovery rate $\gamma$ (infectious period=$1/\gamma$). + +In `epidemics`, we specify the model inputs as : + +- `transmissibility` = $R_0 \gamma$, +- `infectiousness_rate` = $\alpha$, +- `recovery_rate` = $\gamma$, + +We will simulate a strain of influenza with pandemic potential with $R_0=1.46$, a preinfectious period of 3 days and infectious period of 7 days. Therefore our inputs will be: + +- `transmissibility = 1.46 / 7.0`, +- `infectiousness_rate = 1.0 / 3.0`, +- `recovery_rate = 1.0 / 7.0`. 
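The three inputs listed above can be derived from the natural-history quantities using the relationship `transmissibility` = $R_0 \gamma$. A minimal sketch in base R, where `r0`, `preinfectious_period` and `infectious_period` are illustrative variable names rather than `{epidemics}` arguments:

```r
# natural-history values for the influenza strain described above
r0 <- 1.46
preinfectious_period <- 3 # days
infectious_period <- 7 # days

# rates are the inverse of the average time until the event
infectiousness_rate <- 1 / preinfectious_period
recovery_rate <- 1 / infectious_period

# beta = R0 * gamma
transmissibility <- r0 * recovery_rate
```

Computing the rates once and reusing the named values is one way to keep the model call consistent with the assumed periods.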
+ +::::::::::::::::::::::::::::::::::::: callout +### The basic reproduction number $R_0$ +The basic reproduction number, $R_0$, for the SEIR model is: + +$$ R_0 = \frac{\beta}{\gamma}.$$ + +Therefore, we can rewrite transmissibility $\beta$, as: + +$$ \beta = R_0 \gamma.$$ + + +:::::::::::::::::::::::::::::::::::::::::::::::: + + + + + +## Running the model + +::::::::::::::::::::::::::::::::::::: callout +### Running (solving) the model + +For models that are described by ODEs, running the model actually means to solve the system of ODEs. ODEs describe the rate of change in the disease states with respect to time, to find the number of individuals in each of these states, we use numerical integration methods to solve the equations. + +In `epidemics`, the [ODE solver](https://www.boost.org/doc/libs/1_82_0/libs/numeric/odeint/doc/html/index.htm) uses the [Runge-Kutta method](https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods). +:::::::::::::::::::::::::::::::::::::::::::::::: + +Now we are ready to run our model. To install the `epidemics` package : + +```{r installation, eval = FALSE} +if (!require("pak")) install.packages("pak") +pak::pak("epiverse-trace/epidemics") +``` + +Then we specify `time_end=600` to run the model for 600 days. ```{r run_model} output <- model_default_cpp( population = uk_population, - infection = influenza, - time_end = 600 + transmissibility = 1.46 / 7.0, + infectiousness_rate = 1.0 / 3.0, + recovery_rate = 1.0 / 7.0, + time_end = 600, increment = 1.0 ) head(output) ``` + +*Note : This model also has the functionality to include vaccination and tracks the number of vaccinated individuals through time. Even though we have not specified any vaccination, there is still a vaccinated compartment in the output (containing no individuals). We will cover the use of vaccination in future tutorials.* + Our model output consists of the number of individuals in each compartment in each age group through time. 
We can visualise the exposed and infectious individuals (those in the $E$ and $I$ classes) through time. ```{r visualise, fig.width = 10} -ggplot(output[compartment == "infectious", ]) + +filter(output, compartment %in% c("exposed", "infectious")) %>% + ggplot( + aes( + x = time, + y = value, + col = demography_group, + linetype = compartment + ) + ) + geom_line( - aes(time, value, colour = demography_group), - linewidth = 1 + linewidth = 1.2 + ) + + scale_y_continuous( + labels = scales::comma ) + scale_colour_brewer( palette = "Dark2", - labels = rownames(contact_matrix), name = "Age group" ) + - scale_y_continuous( - labels = scales::comma, - name = "Infectious indivduals" + expand_limits( + y = c(0, 500e3) ) + - labs( - x = "Model time (days)" + coord_cartesian( + expand = FALSE + ) + + theme_bw( + base_size = 15 ) + - theme_classic() + theme( legend.position = "top" ) + - theme_grey( - base_size = 15 + labs( + x = "Simulation time (days)", + linetype = "Compartment", + y = "Individuals" ) ``` @@ -457,13 +467,13 @@ ggplot(output[compartment == "infectious", ]) + ::::::::::::::::::::::::::::::::::::: callout ### Time increments -Note that there is a default argument of `increment = 1`. This relates to the time step of the ODE solver. When the parameters and maximum number of time steps is days, the default increment is one day. +Note that there is a default argument of `increment = 1`. This relates to the time step of the ODE solver. When the parameters are specified on a daily time scale and the maximum number of time steps (`time_end`) is in days, the default time step of the ODE solver is one day. The choice of increment will depend on the time scale of the parameters, and the rate at which events can occur. In general, the increment should be smaller than the fastest event. For example, if parameters are on a monthly time scale, but some events will occur within a month, then the increment should be less than one month. 
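One rough way to apply this rule of thumb is sketched below in base R. It assumes that the typical time of an event is the inverse of its rate, and it uses the infectiousness and recovery rates from this tutorial; the check itself is an illustration and not something `{epidemics}` performs.

```r
# daily rates used in this tutorial
infectiousness_rate <- 1 / 3
recovery_rate <- 1 / 7

# average time until each event is the inverse of its rate (days)
event_times <- 1 / c(infectiousness_rate, recovery_rate)

# candidate time step of the ODE solver (days)
increment <- 1

# TRUE when the increment is shorter than the fastest event
increment < min(event_times)
```

Here the fastest event (onset of infectiousness) takes about three days on average, so a one-day increment satisfies the rule.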
:::::::::::::::::::::::::::::::::::::::::::::::: -### Accounting for uncertainty +## Accounting for uncertainty As the epidemic model is [deterministic](../learners/reference.md#deterministic), we have one trajectory for our given parameter values. In practice, we have uncertainty in the value of our parameters. To account for this, we must run our model for different parameter combinations. @@ -479,30 +489,20 @@ R0_vec <- rnorm(100, 1.5, 0.05) ```{r samples} output_samples <- Map( - R0_vec, + R0_vec, seq_along(R0_vec), f = function(x, i) { - # create infection object for R0 value - influenza <- infection( - name = "influenza", - r0 = x, - preinfectious_period = 3, - infectious_period = 7 - ) - # run an epidemic model using `epidemic()` output <- model_default_cpp( population = uk_population, - infection = influenza, + transmissibility = x / 7.0, + infectiousness_rate = 1.0 / 3.0, + recovery_rate = 1.0 / 7.0, time_end = 600, increment = 1.0 ) - # extract infectious individuals - output <- output[compartment == "infectious"] - - # assign scenario number - output[, c("scenario", "R") := list(i, x)] - + # add replicate number and return data + output$replicate <- x output } ) @@ -515,31 +515,40 @@ output_samples <- bind_rows(output_samples) 3. 
Calculate the mean and 95% quantiles of the number of infectious individuals across each model simulation and visualise the output ```{r plot, fig.width = 10} -ggplot(output_samples ,aes(time, value)) + +ggplot(output_samples[output_samples$compartment == "infectious", ], aes(time, value)) + stat_summary(geom = "line", fun = mean) + - stat_summary(geom = "ribbon", - fun.min = function(z) { quantile(z, 0.025) }, - fun.max = function(z) { quantile(z, 0.975) }, - alpha = 0.3) + + stat_summary( + geom = "ribbon", + fun.min = function(z) { + quantile(z, 0.025) + }, + fun.max = function(z) { + quantile(z, 0.975) + }, + alpha = 0.3 + ) + facet_grid( cols = vars(demography_group) ) + - theme_grey( + labs( + x = "Simulation time (days)", + y = "Individuals" + ) + + theme_bw( base_size = 15 ) ``` -Deciding which parameters to include uncertainty in depends on a few factors: how well informed a parameter value is e.g. consistency of estimates from the literature; how sensitive model outputs are to parameter value changes; and the purpose of the modelling task. +Deciding which parameters to include uncertainty in depends on a few factors: how well informed a parameter value is e.g. consistency of estimates from the literature; how sensitive model outputs are to parameter value changes; and the purpose of the modelling task. See [McCabe et al. 2021](https://doi.org/10.1016%2Fj.epidem.2021.100520) to learn about different types of uncertainty in infectious disease modelling. ## Summary -In this tutorial, we have learnt how to generate disease trajectories using a mathematical model. Once a model has been chosen, the parameters and other inputs must be specified in the correct way to perform model simulations. In the next tutorial, we will consider how to choose the right model for different tasks. +In this tutorial, we have learnt how to simulate disease spread using a mathematical model.
Once a model has been chosen, the parameters and other inputs must be specified in the correct way to perform model simulations. In the next tutorial, we will consider how to choose the right model for different tasks. ::::::::::::::::::::::::::::::::::::: keypoints - Disease trajectories can be generated using the R package `epidemics` -- Contact matrices can be estimated from different sources -- Include uncertainty in model trajectories +- Uncertainty should be included in model trajectories using a range of model parameter values :::::::::::::::::::::::::::::::::::::::::::::::: diff --git a/learners/reference.md b/learners/reference.md index 5cec1299..14c8148d 100644 --- a/learners/reference.md +++ b/learners/reference.md @@ -9,7 +9,13 @@ title: 'Glossary of Terms: Epiverse-TRACE' - +## C + +[Contact matrix]{#contact} +: The contact matrix is a square matrix consisting of rows/columns equal to the number of age groups. Each element represents the frequency of contacts between age groups. If we believe that transmission of an infection is driven by contact, and that contact rates are very different for different age groups, then specifying a contact matrix allows us to account for age-specific rates of transmission. + +[C++]{#cplusplus} +: C++ is a high-level programming language that can be used within R to speed up sections of code. To learn more about C++ check out these [tutorials](https://cplusplus.com/doc/tutorial/) and learn more about the integration of C++ and R [here](https://www.rcpp.org/). ## D @@ -30,11 +36,15 @@ title: 'Glossary of Terms: Epiverse-TRACE' ## I [Incubation period]{#incubation} -: The time between becoming infected and the onset of infectiousness, same as [latent period](#latent). +: The time between becoming infected and the onset of symptoms. [More information on the incubation period](https://en.wikipedia.org/wiki/Latent_period_(epidemiology)#Incubation_period).
[Indirect transmission]{#indirect} : Indirectly transmitted infections are passed on to humans via contact with vectors, animals or contaminated environment. Vector-borne infections, zoonoses and water-borne infections are modelled as indirectly transmitted. +[Initial conditions]{#initial} +: In [ODEs](#ordinary), the initial conditions are the values of the state variables at the start of the model simulation (at time 0). For example, if there is one infectious individual in a population of 1000 in a Susceptible-Infectious-Recovered model, the initial conditions would be $S(0) = 999$, $I(0) = 1$, $R(0) = 0$. + + @@ -42,14 +52,21 @@ title: 'Glossary of Terms: Epiverse-TRACE' ## L [Latent period]{#latent} -: The time between becoming infected and the onset of infectiousness, same as [incubation period](#incubation). +: The time between becoming infected and the onset of infectiousness. [More information on the latent period](https://en.wikipedia.org/wiki/Latent_period_(epidemiology)). - +## M +[Model parameters (ODEs)]{#parsode} +: The model parameters are used in [ordinary differential equation](#ordinary) models to describe the flow between disease states. For example, a transmission rate $\beta$ is a model parameter that can be used to describe the flow between susceptible and infectious states. - - +## N +[Non-pharmaceutical interventions]{#NPIs} +: Non-pharmaceutical interventions (NPIs) are measures put in place to reduce transmission that do not include the administration of drugs or vaccinations. [More information on NPIs](https://www.gov.uk/government/publications/technical-report-on-the-covid-19-pandemic-in-the-uk/chapter-8-non-pharmaceutical-interventions). + +## O +[Ordinary differential equations]{#ordinary} +: Ordinary differential equations (ODEs) can be used to represent the rate of change of one variable (e.g. number of infected individuals) with respect to another (e.g. time).
Check out this introduction to [ODEs](https://mathinsight.org/ordinary_differential_equation_introduction). ODEs are widely used in infectious disease modelling to model the flow of individuals between different disease states. @@ -59,6 +76,9 @@ title: 'Glossary of Terms: Epiverse-TRACE' ## S +[State variables]{#state} +: The state variables in a model represented by [ordinary differential equations](#ordinary) are the disease states that individuals can be in e.g. if individuals can be susceptible, infectious or recovered the state variables are $S$, $I$ and $R$. There is an ordinary differential equation for each state variable. + [Stochastic model]{#stochastic} : A model that includes some stochastic process resulting in variation in model simulations for the same initial conditions and parameter values. Examples include stochastic differential equations and branching process models. For more detail see [Allen (2017)](https://doi.org/10.1016/j.idm.2017.03.001). @@ -73,6 +93,7 @@ title: 'Glossary of Terms: Epiverse-TRACE' : Vector-borne transmission means an infection can be passed from a vector (e.g. mosquitoes) to humans. Examples of vector-borne diseases include malaria and dengue. The World Health Organization have a [Fact sheet about Vector-borne diseases](https://www.who.int/news-room/fact-sheets/detail/vector-borne-diseases) with key information and a list of them according to their vector. 
+ diff --git a/renv/profiles/lesson-requirements/renv.lock b/renv/profiles/lesson-requirements/renv.lock index b87ec9ca..09b8a749 100644 --- a/renv/profiles/lesson-requirements/renv.lock +++ b/renv/profiles/lesson-requirements/renv.lock @@ -567,24 +567,23 @@ "Source": "GitHub", "RemoteType": "github", "RemoteHost": "api.github.com", - "RemoteRepo": "epidemics", "RemoteUsername": "epiverse-trace", - "RemotePkgRef": "epiverse-trace/epidemics", - "RemoteRef": "HEAD", - "RemoteSha": "6004c3a7e50be7b127070c4e96a011630307df17", + "RemoteRepo": "epidemics", + "RemoteRef": "main", + "RemoteSha": "5f6825d7ee9fbd162b26db95bc2f3e0d3054f452", "Requirements": [ "BH", "Rcpp", "RcppEigen", "checkmate", + "cli", "data.table", "deSolve", "glue", - "jsonlite", "stats", "utils" ], - "Hash": "96a7b1457b8a5d89f146d0abf3de7eff" + "Hash": "5abca47c2b33eb1a73a412a045823a8f" }, "evaluate": { "Package": "evaluate", @@ -1179,18 +1178,6 @@ ], "Hash": "2a0dc8c6adfb6f032e4d4af82d258ab5" }, - "pak": { - "Package": "pak", - "Version": "0.7.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "tools", - "utils" - ], - "Hash": "17c4c32b4bdc087508f35e8b7dcf4191" - }, "pillar": { "Package": "pillar", "Version": "1.9.0", @@ -1628,7 +1615,7 @@ "Package": "systemfonts", "Version": "1.0.5", "Source": "Repository", - "Repository": "RSPM", + "Repository": "https://carpentries.r-universe.dev", "Requirements": [ "R", "cpp11" @@ -1668,7 +1655,7 @@ "Package": "textshaping", "Version": "0.3.7", "Source": "Repository", - "Repository": "RSPM", + "Repository": "https://carpentries.r-universe.dev", "Requirements": [ "R", "cpp11",