Instructor Notes
Introduction to outbreak analytics
Instructor Note
Useful concept maps for teaching this episode are:
- https://github.com/rstudio/concept-maps?tab=readme-ov-file#dplyr
- https://github.com/rstudio/concept-maps?tab=readme-ov-file#pipe-operator
- https://github.com/rstudio/concept-maps?tab=readme-ov-file#pivoting
Instructor Note
The information to collect depends on the questions we need to answer.
At the beginning of an outbreak, we need data to answer questions like:
- How fast does an epidemic grow?
- What is the risk of death?
- How many cases can I expect in the coming days?
Informative indicators are:
- growth rate, reproduction number.
- case fatality risk, hospitalization fatality risk.
- projection or forecast of cases.
Useful data are:
- date of onset, date of death.
- delays from infection to onset, from onset to death.
- percentage of observations detected by surveillance system.
- subject characteristics to stratify the analysis by person, place, time.
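If learners ask how these data fields turn into indicators, a minimal sketch like the one below may help. It assumes a hypothetical toy line list (the linelist object, its values, and its column names are made up for illustration, not taken from the lesson data) and derives an onset-to-death delay and a daily incidence count with dplyr.

```r
library(dplyr)

# Hypothetical toy line list; object name, column names, and values are assumptions.
linelist <- tibble::tibble(
  id = 1:5,
  date_of_onset = as.Date(
    c("2014-05-01", "2014-05-03", "2014-05-03", "2014-05-06", "2014-05-08")
  ),
  date_of_death = as.Date(
    c("2014-05-09", NA, "2014-05-13", NA, NA)
  )
)

# Delay from onset to death in days (NA where no death is recorded).
delays <- linelist %>%
  mutate(onset_to_death = as.numeric(date_of_death - date_of_onset))

# Daily incidence: number of new cases by date of onset.
incidence <- linelist %>%
  count(date_of_onset, name = "cases")
```

Here delays keeps one row per case, while incidence has one row per onset date, which is the shape needed for growth rate or forecasting steps later on.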
Instructor Note
- date of infection: mostly unknown, dependent on the limited coverage of contact tracing or outbreak research, and sensitive to recall bias from subjects.
- date of outcome: subject to reporting delay.
Instructor Note
Keeping cases with a missing outcome is useful for tracking the incidence of new cases, which in turn helps assess transmission.
However, when assessing severity, CFR estimation is sensitive to:
- Right-censoring bias. If we include observations with unknown final status, we can underestimate the true CFR.
- Selection bias. At the beginning of an outbreak, health systems mostly detect the most clinically severe cases, so an early estimate of the CFR can overestimate the true CFR.
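To make the right-censoring point concrete during discussion, the sketch below compares two naive CFR calculations on a hypothetical toy data set (the cases_toy object, the outcome column, and its "Death"/"Recover" labels are assumptions for illustration). It only shows how the choice of denominator moves the estimate; neither value is a delay-adjusted CFR.

```r
library(dplyr)

# Hypothetical outcomes: NA marks cases whose final status is still unknown.
cases_toy <- tibble::tibble(
  outcome = c("Death", "Recover", NA, "Death", NA,
              "Recover", NA, "Recover", NA, NA)
)

cfr <- cases_toy %>%
  summarise(
    deaths = sum(outcome == "Death", na.rm = TRUE),
    n_all = n(),                     # every case, known outcome or not
    n_known = sum(!is.na(outcome)),  # only cases with a known final status
    cfr_all_cases = deaths / n_all,
    cfr_known_outcome = deaths / n_known
  )

cfr
```

With half of the outcomes unknown, counting every case in the denominator gives 0.2, while restricting to known outcomes gives 0.4. Restricting to known outcomes can carry its own bias, for example if deaths are reported sooner than recoveries, so both numbers are best presented as rough early estimates.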
Instructor Note
Close inspection of the line list shows that the last date of any entry (by date of hospitalization) is a bit later than the last date of symptom onset.
From the cases object we can use:
- dplyr::summarise() to summarise each group down to one row,
- base::max() to calculate the maximum dates of onset and hospitalisation.
When showcasing this to learners in a live coding session, you can also use cases %>% view() to rearrange by date columns.
R
cases %>%
  dplyr::summarise(
    max_onset = max(date_of_onset),
    max_hospital = max(date_of_hospitalisation)
  )
OUTPUT
# A tibble: 1 × 2
  max_onset  max_hospital
  <date>     <date>
1 2014-06-27 2014-07-07
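If the date columns contain missing values, max() returns NA unless na.rm = TRUE is set. The sketch below shows one possible variation, using a hypothetical stand-in for the cases object (cases_toy and its values are made up for illustration), with dplyr::across() to apply the same summary to both date columns and dplyr::arrange() as a scripted alternative to sorting in the viewer.

```r
library(dplyr)

# Hypothetical stand-in for the lesson's `cases` object; values are assumptions.
cases_toy <- tibble::tibble(
  date_of_onset = as.Date(c("2014-06-20", NA, "2014-06-27")),
  date_of_hospitalisation = as.Date(c("2014-06-25", "2014-07-07", "2014-07-01"))
)

# Latest date in each column, ignoring missing values.
latest <- cases_toy %>%
  summarise(
    across(
      c(date_of_onset, date_of_hospitalisation),
      ~ max(.x, na.rm = TRUE),
      .names = "max_{.col}"
    )
  )

# Rows sorted from the most recent hospitalisation date to the oldest.
sorted <- cases_toy %>%
  arrange(desc(date_of_hospitalisation))
```

The summary columns are named max_date_of_onset and max_date_of_hospitalisation because of the .names pattern; the explicit names used in the lesson code are equally valid.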
Instructor Note
Assess learners on the video refreshers on distributions, likelihood, and maximum likelihood provided in the setup instructions.