superspreading is an R package that provides a set of functions to estimate and understand individual-level variation in the transmission of infectious diseases from data on secondary cases.
superspreading implements methods outlined in Lloyd-Smith et al. (2005), Adam J. Kucharski et al. (2020), and Kremer et al. (2021), as well as additional functions.
superspreading is developed at the Centre for the Mathematical Modelling of Infectious Diseases at the London School of Hygiene and Tropical Medicine as part of Epiverse-TRACE.
Installation
The easiest way to install the development version of superspreading from GitHub is to use the pak package:
# check whether {pak} is installed
if(!require("pak")) install.packages("pak")
pak::pak("epiverse-trace/superspreading")
Quick start
Calculate the heterogeneity of transmission
Case study using data from the early Ebola outbreak in Guinea in 2014, stratified by index and non-index cases, as in Adam J. Kucharski et al. (2016). Data on transmission from index and secondary cases for Ebola in 2014.
Source: Faye et al. (2015) & Althaus (2015).
{fitdistrplus} is a well-developed and stable R package that provides a variety of methods for fitting distribution models to data (Delignette-Muller and Dutang 2015). Therefore, it is used throughout the documentation of superspreading and is a recommended package for those wanting to fit distribution models, for example those supplied in superspreading (Poisson-lognormal and Poisson-Weibull). We recommend reading the fitdistrplus documentation (specifically ?fitdist) to explore the full range of functionality.
In this example we fit the negative binomial distribution to estimate the reproduction number (R, which is the mean of the distribution) and the dispersion (k, which is a measure of the variance of the distribution). The parameters are estimated via maximum likelihood (the default method for fitdist()).
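For reference, the relationship between R, k and the variance can be written out explicitly. The expression below assumes the size/mu parameterisation used by R's dnbinom() (the fitted parameters are named size and mu before being renamed to k and R), with X denoting the number of secondary cases caused by one case.

```latex
P(X = x) = \frac{\Gamma(x + k)}{\Gamma(k)\, x!}
           \left(\frac{k}{k + R}\right)^{k}
           \left(\frac{R}{k + R}\right)^{x},
\qquad
\mathbb{E}[X] = R,
\qquad
\operatorname{Var}(X) = R + \frac{R^{2}}{k}
```

For a fixed R, a smaller k gives a larger variance, and as k grows large the variance approaches R, which is the Poisson case.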
# we use {fitdistrplus} to fit the models
library(fitdistrplus)
#> Loading required package: MASS
#> Loading required package: survival
# transmission events from index cases
index_case_transmission <- c(2, 17, 5, 1, 8, 2, 14)
# transmission events from secondary cases
secondary_case_transmission <- c(
  1, 2, 1, 4, 4, 1, 3, 3, 1, 1, 4, 9, 9, 1, 2, 1, 1, 1, 4, 3, 3, 4, 2, 5,
  1, 2, 2, 1, 9, 1, 3, 1, 2, 1, 1, 2
)
# Format data into index and non-index cases
# total non-index cases
n_non_index <- sum(c(index_case_transmission, secondary_case_transmission))
# transmission from all non-index cases
# (cases with no recorded onward transmission are padded as zeros)
non_index_cases <- c(
  secondary_case_transmission,
  rep(0, n_non_index - length(secondary_case_transmission))
)
# Estimate R and k for index and non-index cases
param_index <- fitdist(data = index_case_transmission, distr = "nbinom")
# rename size and mu to k and R
names(param_index$estimate) <- c("k", "R")
param_index$estimate
#>        k        R
#> 1.596646 7.000771
param_non_index <- fitdist(data = non_index_cases, distr = "nbinom")
# rename size and mu to k and R
names(param_non_index$estimate) <- c("k", "R")
param_non_index$estimate
#>         k         R
#> 0.1937490 0.6619608
The reproduction number (R) is higher for index cases than for non-index cases, but the heterogeneity in transmission is higher for non-index cases (i.e. k is lower).
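One way to sanity-check a fit like this is to compare the observed share of cases that caused no onward transmission with the share implied by the fitted negative binomial. The sketch below uses only base R; the data are copied from the example, and the estimates are hard-coded from the printed output rather than recomputed, so it is an illustration and not part of the package.

```r
# Data copied from the example above
index_case_transmission <- c(2, 17, 5, 1, 8, 2, 14)
secondary_case_transmission <- c(
  1, 2, 1, 4, 4, 1, 3, 3, 1, 1, 4, 9, 9, 1, 2, 1, 1, 1, 4, 3, 3, 4, 2, 5,
  1, 2, 2, 1, 9, 1, 3, 1, 2, 1, 1, 2
)
n_non_index <- sum(c(index_case_transmission, secondary_case_transmission))
non_index_cases <- c(
  secondary_case_transmission,
  rep(0, n_non_index - length(secondary_case_transmission))
)

# Estimates for non-index cases, hard-coded from the output printed above
R_non_index <- 0.6619608
k_non_index <- 0.1937490

# Observed proportion of non-index cases with zero secondary cases
observed_zero <- mean(non_index_cases == 0)

# Proportion of zeros implied by the fitted negative binomial
fitted_zero <- dnbinom(0, size = k_non_index, mu = R_non_index)

c(observed = observed_zero, fitted = fitted_zero)
```

Both values come out at roughly 0.75, which is what a low k with R below 1 implies: most non-index cases infect nobody, and a small number account for most transmission.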
Calculate the probability of a large epidemic
Given the reproduction number (R) and the dispersion (k), the probability that an infectious disease will cause an epidemic, in other words the probability that it does not go extinct, can be calculated using probability_epidemic(). Here we use probability_epidemic() with the parameters estimated in the above section for Ebola, assuming there are three initial infections seeding the potential outbreak.
# load {superspreading} to make probability_epidemic() available
library(superspreading)
# Compare probability of a large outbreak when k varies according to
# index/non-index values, assuming 3 initial spillover infections
initial_infections <- 3
# Probability of an epidemic using k estimated from index cases
probability_epidemic(
  R = param_index$estimate[["R"]],
  k = param_index$estimate[["k"]],
  num_init_infect = initial_infections
)
#> [1] 0.9995781
# Probability of an epidemic using k estimated from non-index cases
probability_epidemic(
  R = param_non_index$estimate[["R"]],
  k = param_non_index$estimate[["k"]],
  num_init_infect = initial_infections
)
#> [1] 0
The probability of causing a sustained outbreak is high for the index cases, but is zero for non-index cases (i.e. disease transmission will inevitably cease, assuming transmission dynamics do not change).
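To see where numbers like these can come from, the sketch below uses the standard branching-process result for negative binomial offspring: the extinction probability q for a single introduction is the smallest root in [0, 1] of q = (1 + R(1 - q)/k)^(-k), and with n independent introductions the epidemic probability is 1 - q^n. This is an illustration in base R under that assumption; the function names are hypothetical, and it is not a description of how probability_epidemic() is implemented internally.

```r
# Hypothetical helper: extinction probability for one initial infection,
# assuming negative binomial offspring with mean R and dispersion k
extinction_prob_sketch <- function(R, k) {
  # with R <= 1 the only root in [0, 1] is q = 1 (certain extinction)
  if (R <= 1) {
    return(1)
  }
  # probability generating function of the offspring distribution
  pgf <- function(q) (1 + R * (1 - q) / k)^(-k)
  # solve pgf(q) = q on [0, 1), excluding the trivial root at q = 1
  uniroot(
    function(q) pgf(q) - q,
    lower = 0, upper = 1 - 1e-9, tol = 1e-12
  )$root
}

# Hypothetical helper: probability that at least one of the
# num_init_infect independent introductions escapes extinction
epidemic_prob_sketch <- function(R, k, num_init_infect) {
  1 - extinction_prob_sketch(R, k)^num_init_infect
}

# Estimates hard-coded from the fitted output above
p_index <- epidemic_prob_sketch(R = 7.000771, k = 1.596646, num_init_infect = 3)
p_non_index <- epidemic_prob_sketch(R = 0.6619608, k = 0.1937490, num_init_infect = 3)
c(index = p_index, non_index = p_non_index)
```

The index value should be close to the one printed above (small differences can arise from solver tolerance), and the non-index value is exactly zero because R is below 1. The root finder may fail when R is only marginally above 1, so this sketch is not a robust replacement for the package function.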
Package vignettes
More details on how to use superspreading can be found in the online documentation as package vignettes, under “Articles”.
Visualisation and plotting functionality
superspreading does not provide plotting functions. Instead, the package’s vignettes include example code chunks that can be used as templates and modified to build data visualisations. We recommend users copy and edit the examples for their own purposes (this is permitted under the package’s MIT license). By default, code chunks for plotting are folded; to unfold them and see the code, click the code button at the top left of the plot.
Help
To report a bug, please open an issue.
Contribute
Contributions to superspreading are welcomed. Please follow the package contributing guide.
Code of Conduct
Please note that the {superspreading} project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Citing this package
citation("superspreading")
#> To cite package 'superspreading' in publications use:
#>
#> Lambert J, Kucharski A (2024). _superspreading: Estimate
#> Individual-Level Variation in Transmission_. R package version
#> 0.1.0.9000, https://epiverse-trace.github.io/superspreading/,
#> <https://github.com/epiverse-trace/superspreading>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {superspreading: Estimate Individual-Level Variation in Transmission},
#> author = {Joshua W. Lambert and Adam Kucharski},
#> year = {2024},
#> note = {R package version 0.1.0.9000,
#> https://epiverse-trace.github.io/superspreading/},
#> url = {https://github.com/epiverse-trace/superspreading},
#> }
Related projects
This project has some overlap with other R packages:

{bpmodels} is another Epiverse-TRACE R package that analyses transmission chain data to infer the transmission process for either the size or length of transmission chains. Two main differences between the packages are: 1) superspreading has more functions to compute metrics that characterise outbreaks and superspreading events (e.g. probability_epidemic() & probability_extinct()); 2) bpmodels can simulate a branching process (chain_sim()) with a specified process (e.g. negative binomial).
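To make the idea of simulating a branching process with a negative binomial offspring distribution concrete, here is a minimal base R sketch. It does not use bpmodels or its chain_sim() function (whose arguments are not shown here). The helper name, the cap of 1000 cases used as a stand-in for "a large outbreak", and the number of replicates are all assumptions chosen for illustration.

```r
# Hypothetical helper: simulate one outbreak generation by generation and
# report whether it died out before reaching max_cases total cases
simulate_extinction_sketch <- function(R, k, n_init, max_cases = 1000) {
  active <- n_init # cases in the current generation
  total <- n_init  # cumulative cases so far
  while (active > 0 && total < max_cases) {
    # each active case draws its number of secondary cases
    active <- sum(rnbinom(n = active, size = k, mu = R))
    total <- total + active
  }
  active == 0
}

set.seed(1)
# Estimates hard-coded from the Ebola example in the quick start
extinct_non_index <- replicate(
  200, simulate_extinction_sketch(R = 0.6619608, k = 0.1937490, n_init = 3)
)
extinct_index <- replicate(
  200, simulate_extinction_sketch(R = 7.000771, k = 1.596646, n_init = 3)
)

# Proportion of simulated outbreaks that went extinct
c(non_index = mean(extinct_non_index), index = mean(extinct_index))
```

With the non-index parameters every simulated chain should die out, while with the index parameters almost none should, which is consistent with the probability_epidemic() results in the quick start.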