Skip to contents

It generates independent transmission chains starting with a single case per chain, using a simple branching process model (See details for definition of "chains" and assumptions). Offspring for each chain are generated with an offspring distribution, and an optional generation time distribution function.

The individual chain simulations are controlled by customisable stopping criteria, including a threshold chain size or length, and a generation time cut off. The function also optionally accepts population related inputs such as the population size (defaults to Inf) and percentage of the population initially immune (defaults to 0).

Usage

simulate_chains(
  n_chains,
  statistic = c("size", "length"),
  offspring_dist,
  ...,
  stat_threshold = Inf,
  pop = Inf,
  percent_immune = 0,
  generation_time = NULL,
  t0 = 0,
  tf = Inf
)

Arguments

n_chains

Number of chains to simulate.

statistic

The chain statistic to track as the stopping criteria for each chain being simulated when stat_threshold is not Inf; A <string>. It can be one of:

  • "size": the total number of cases produced by a chain before it goes extinct.

  • "length": the total number of generations reached by a chain before it goes extinct.

offspring_dist

Offspring distribution: a <function> like the ones provided by R to generate random numbers from given distributions (e.g., rpois for Poisson). More specifically, the function needs to accept at least one argument, n, which is the number of random numbers to generate. It can accept further arguments, which will be passed on to the random number generating functions. Examples that can be provided here are rpois for Poisson distributed offspring, rnbinom for negative binomial offspring, or custom functions.

...

Parameters of the offspring distribution as required by R.

stat_threshold

A stopping criterion for individual chain simulations; a positive number coercible to integer. When any chain's cumulative statistic reaches or surpasses stat_threshold, that chain ends. Defaults to Inf. For example, if statistic = "size" and stat_threshold = 10, then any chain that produces 10 or more cases will stop. Note that setting stat_threshold does not guarantee that all chains will stop at the same value.

pop

Population size; An <Integer>. Used alongside percent_immune to define the susceptible population. Defaults to Inf.

percent_immune

Percent of the population immune to infection at the start of the simulation; A <numeric> between 0 and 1. Used alongside pop to initialise the susceptible population. Defaults to 0.

generation_time

The generation time function; the name of a user-defined named or anonymous function with only one argument n, representing the number of generation times to sample.

t0

Start time (if generation time is given); either a single value or a vector of same length as n_chains (number of initial cases) with corresponding initial times. Defaults to 0, meaning all cases started at time 0.

tf

A number for the cut-off for the infection times (if generation time is given); Defaults to Inf.

Value

An <epichains> object, which is basically a <data.frame> with columns:

  • chain - an ID for active/ongoing chains,

  • infectee - a unique ID for each infectee.

  • infector - an ID for the infector of each infectee.

  • generation - a discrete time point during which infection occurs, and optionally,

  • time - the time of infection.

Definition of a transmission chain

A transmission chain as used here is an independent case and all the secondary cases linked to it through transmission. The chain starts with a single case, and each case in the chain generates secondary cases according to the offspring distribution. The chain ends when no more secondary cases are generated.

Calculating chain sizes and lengths

The function simulates the chain size for chain \(i\) at time \(t\), \(I_{t, i}\), as: $$I_{t, i} = \sum_{i}^{I_{t-1}}X_{t, i},$$ and the chain length/duration for chain \(i\) at time \(t\), \(L_{t, i}\), as: $$L_{t, i} = {\sf min}(1, X_{t, i}), $$ where \(X_{t, i}\) is the secondary cases generated by chain \(i\) at time \(t\), and \(I_{0, i} = L_{0, i} = 1\).

The distribution of secondary cases, \(X_{t, i}\) is modelled by the offspring distribution (offspring_dist).

Specifying generation_time

The argument generation_time must be specified as a function with one argument, n.

For example, assuming we want to specify the generation time as a random log-normally distributed variable with meanlog = 0.58 and sdlog = 1.58, we could define a named function, let's call it "generation_time_fn", with only one argument representing the number of generation times to sample: generation_time_fn <- function(n){rlnorm(n, 0.58, 1.38)}, and assign the name of the function to generation_time in the simulation function, i.e. `simulate_*`(..., generation_time = generation_time_fn), where ... are the other arguments to simulate_*() and * is a placeholder for the rest of simulation function's name.

Alternatively, we could assign an anonymous function to generation_time in the simulate_*() call, i.e. simulate_*(..., generation_time = function(n){rlnorm(n, 0.58, 1.38)}) OR simulate_*(..., generation_time = \(n){rlnorm(n, 0.58, 1.38)}), where ... are the other arguments to simulate_*().

References

Jacob C. (2010). Branching processes: their role in epidemiology. International journal of environmental research and public health, 7(3), 1186–1204. doi:10.3390/ijerph7031204

Blumberg, S., and J. O. Lloyd-Smith. 2013. "Comparing Methods for Estimating R0 from the Size Distribution of Subcritical Transmission Chains." Epidemics 5 (3): 131–45. doi:10.1016/j.epidem.2013.05.002 .

Farrington, C. P., M. N. Kanaan, and N. J. Gay. 2003. "Branching Process Models for Surveillance of Infectious Diseases Controlled by Mass Vaccination.” Biostatistics (Oxford, England) 4 (2): 279–95. doi:10.1093/biostatistics/4.2.279 .

Author

James M. Azam, Sebastian Funk

Examples

# Using a Poisson offspring distribution and simulating from an infinite
# population up to chain size 10.
set.seed(32)
chains_pois_offspring <- simulate_chains(
  n_chains = 10,
  statistic = "size",
  offspring_dist = rpois,
  stat_threshold = 10,
  generation_time = function(n) rep(3, n),
  lambda = 2
)
chains_pois_offspring
#> `<epichains>` object
#> 
#> < epichains head (from first known infector) >
#> 
#>    chain infector infectee generation time
#> 11     1        1        2          2    3
#> 12     1        1        3          2    3
#> 13     2        1        2          2    3
#> 14     2        1        3          2    3
#> 15     3        1        2          2    3
#> 16     3        1        3          2    3
#> 
#> 
#> Number of chains: 10
#> Number of infectors (known): 9
#> Number of generations: 4
#> Use `as.data.frame(<object_name>)` to view the full output in the console.

# Using a Negative binomial offspring distribution and simulating from a
# finite population up to chain size 10.
set.seed(32)
chains_nbinom_offspring <- simulate_chains(
  n_chains = 10,
  pop = 100,
  percent_immune = 0,
  statistic = "size",
  offspring_dist = rnbinom,
  stat_threshold = 10,
  generation_time = function(n) rep(3, n),
  mu = 2,
  size = 0.2
)
chains_nbinom_offspring
#> `<epichains>` object
#> 
#> < epichains head (from first known infector) >
#> 
#>    chain infector infectee generation time
#> 11     1        1        2          2    3
#> 12     2        1        2          2    3
#> 13     2        1        3          2    3
#> 14     2        1        4          2    3
#> 15     2        1        5          2    3
#> 16     2        1        6          2    3
#> 
#> 
#> Number of chains: 10
#> Number of infectors (known): 5
#> Number of generations: 4
#> Use `as.data.frame(<object_name>)` to view the full output in the console.