This function converts a data.frame or a tibble into a linelist object,
where different types of epidemiologically relevant data are tagged. This
includes dates of different events (e.g. onset of symptoms, case reporting),
information on the patient (e.g. age, gender, location) as well as other
information such as the type of case (e.g. confirmed, probable) or the
outcome of the disease. The output looks like the original data.frame, but
linelist-aware packages can automatically use the tagged
fields for further data cleaning and analysis.
Arguments
- x
  a data.frame or a tibble containing case line list data, with cases in rows and variables in columns
- ...
  <dynamic-dots> A series of tags provided as tag_name = "column_name", where tag_name indicates any of the known variables listed in 'Details' and values indicate their name in x; see 'Details' for a list of known variable types and their expected content
- allow_extra
  a logical indicating if additional data tags not currently recognized by linelist should be allowed; if FALSE, unknown tags will trigger an error
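As a rough illustration of how allow_extra interacts with the tags passed through the dots, the sketch below uses a small made-up data.frame. The data, its column names, and the extra tag name 'vaccination' are hypothetical and chosen only for this example.

```r
library(linelist)

# Hypothetical toy data; column names are illustrative only
cases <- data.frame(
  case_id = 1:3,
  onset = as.Date(c("2024-01-02", "2024-01-05", "2024-01-09")),
  vaccinated = c(TRUE, FALSE, TRUE)
)

# 'vaccination' is not a known tag, so with the default
# allow_extra = FALSE this call is expected to raise an error
res <- try(
  make_linelist(cases, id = "case_id", vaccination = "vaccinated"),
  silent = TRUE
)
inherits(res, "try-error")

# With allow_extra = TRUE the unknown tag is accepted
x <- make_linelist(
  cases,
  id = "case_id",
  date_onset = "onset",
  vaccination = "vaccinated",
  allow_extra = TRUE
)
tags(x)
```

Keeping the default (allow_extra = FALSE) is a reasonable way to catch misspelled tag names early.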
Details
Known variable types include:
- id: a unique case identifier as numeric or character
- date_onset: date of symptom onset (see below for date formats)
- date_reporting: date of case notification (see below for date formats)
- date_admission: date of hospital admission (see below for date formats)
- date_discharge: date of hospital discharge (see below for date formats)
- date_outcome: date of disease outcome (see below for date formats)
- date_death: date of death (see below for date formats)
- gender: a factor or character indicating the gender of the patient
- age: a numeric indicating the age of the patient, in years
- location: a factor or character indicating the location of the patient
- occupation: a factor or character indicating the professional activity of the patient
- hcw: a logical indicating if the patient is a health care worker
- outcome: a factor or character indicating the outcome of the disease (death or survival)
Dates can be provided in the following formats/types:
- Date objects (e.g. using as.Date on a character with a correct date format); this is the recommended format
- POSIXct/POSIXlt objects (when a finer scale than days is needed)
- numeric values, typically indicating the number of days since the first case
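The date formats above can be prepared with base R before tagging. The following is a minimal sketch on made-up data; the data.frame and its column names are hypothetical.

```r
# Hypothetical raw data with onset dates stored as text
raw <- data.frame(
  case_id = c("a", "b", "c"),
  onset = c("2024-01-02", "2024-01-05", "2024-01-09"),
  stringsAsFactors = FALSE
)

# Recommended: convert to Date, stating the format explicitly
raw$onset <- as.Date(raw$onset, format = "%Y-%m-%d")
class(raw$onset)

# Alternative: numeric number of days since the first case
raw$onset_day <- as.numeric(raw$onset - min(raw$onset))
raw$onset_day
```

Either column could then be passed as date_onset, though the Date version is the one recommended above.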
See also
An overview of the linelist package
- tags_names(): for a list of known tag names
- tags_types(): for the associated accepted types/classes
- tags(): for a list of tagged variables in a linelist
- set_tags(): for modifying tags
- tags_df(): for selecting variables by tags
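The helpers listed above are typically used together once a linelist exists. The sketch below shows one possible sequence on a made-up data.frame; the data and column names are hypothetical, and the calls are limited to the functions named in this list.

```r
library(linelist)

# Hypothetical toy data; column names are illustrative only
cases <- data.frame(
  case_id = 1:3,
  onset = as.Date(c("2024-01-02", "2024-01-05", "2024-01-09")),
  years = c(34, 51, 8)
)
x <- make_linelist(cases, id = "case_id", date_onset = "onset")

# Known tag names, and the tags currently set on x
tags_names()
tags(x)

# Add a tag after creation
x <- set_tags(x, age = "years")

# Extract the tagged columns only
tags_df(x)
```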
Examples
if (require(outbreaks)) {
  ## dataset we will convert to linelist
  head(measles_hagelloch_1861)

  ## create linelist
  x <- make_linelist(measles_hagelloch_1861,
    id = "case_ID",
    date_onset = "date_of_prodrome",
    age = "age",
    gender = "gender"
  )

  ## print result - just first few entries
  head(x)

  ## check tags
  tags(x)

  ## Tags can also be passed as a list with the splice operator (!!!)
  my_tags <- list(
    id = "case_ID",
    date_onset = "date_of_prodrome",
    age = "age",
    gender = "gender"
  )
  new_x <- make_linelist(measles_hagelloch_1861, !!!my_tags)

  ## The output is strictly equivalent to the previous one
  identical(x, new_x)
}
#> [1] TRUE