This function converts a data.frame or a tibble into a linelist object, where different types of epidemiologically relevant data are tagged. This includes dates of different events (e.g. onset of symptoms, case reporting), information on the patient (e.g. age, gender, location), as well as other information such as the type of case (e.g. confirmed, probable) or the outcome of the disease. The output will seem to be the same data.frame, but linelist-aware packages will then be able to automatically use tagged fields for further data cleaning and analysis.
Arguments
- x: a data.frame or a tibble containing case line list data, with cases in rows and variables in columns
- ...: a series of tags provided as tag_name = "column_name", where tag_name indicates any of the known variables listed in 'Details'; alternatively, a named list of variables to be tagged, where names indicate the types of variable (to be selected from tags_names()), and values indicate their name in the input data.frame; see details for a list of known variable types and their expected content
- allow_extra: a logical indicating if additional data tags not currently recognized by linelist should be allowed; if FALSE, unknown tags will trigger an error
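As a rough illustration of how allow_extra changes the behaviour described above, the sketch below tags a small made-up data.frame. The data, the column names and the "severity" tag are hypothetical and chosen only for this example; the exact error message and the way extra tags are reported may differ between versions of linelist.

```r
library(linelist)

# Hypothetical toy data; column names are illustrative only
cases <- data.frame(
  case_id = 1:3,
  onset = as.Date(c("2020-03-01", "2020-03-04", "2020-03-10")),
  severity = c("mild", "severe", "mild")
)

# "severity" is not a known tag, so with allow_extra = FALSE the call is
# expected to fail; tryCatch() captures the error message instead of stopping
strict <- tryCatch(
  make_linelist(
    cases,
    id = "case_id",
    date_onset = "onset",
    severity = "severity",
    allow_extra = FALSE
  ),
  error = function(e) conditionMessage(e)
)

# With allow_extra = TRUE the unrecognized tag should be accepted
x <- make_linelist(
  cases,
  id = "case_id",
  date_onset = "onset",
  severity = "severity",
  allow_extra = TRUE
)

# Inspect which columns are tagged
tags(x)
```

Keeping allow_extra = FALSE is the safer choice when tag names are typed by hand, since a misspelled tag name then fails loudly rather than being stored as an extra tag.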
Details
Known variable types include:
- id: a unique case identifier as numeric or character
- date_onset: date of symptom onset (see below for date formats)
- date_reporting: date of case notification (see below for date formats)
- date_admission: date of hospital admission (see below for date formats)
- date_discharge: date of hospital discharge (see below for date formats)
- date_outcome: date of disease outcome (see below for date formats)
- date_death: date of death (see below for date formats)
- gender: a factor or character indicating the gender of the patient
- age: a numeric indicating the age of the patient, in years
- location: a factor or character indicating the location of the patient
- occupation: a factor or character indicating the professional activity of the patient
- hcw: a logical indicating if the patient is a health care worker
- outcome: a factor or character indicating the outcome of the disease (death or survival)
Dates can be provided in the following formats/types:
- Date objects (e.g. using as.Date on a character with a correct date format); this is the recommended format
- POSIXct/POSIXlt objects (when a finer scale than days is needed)
- numeric values, typically indicating the number of days since the first case
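The three date representations listed above can be produced with base R alone. The following is a minimal sketch using made-up onset dates; the variable names and the values are hypothetical and not part of the package.

```r
# Hypothetical onset dates stored as text in YYYY-MM-DD format
onset_chr <- c("2020-03-01", "2020-03-04", "2020-03-10")

# Recommended: Date objects, built with as.Date() and an explicit format
onset_date <- as.Date(onset_chr, format = "%Y-%m-%d")

# Finer than daily resolution: a POSIXct date-time with an explicit time zone
onset_time <- as.POSIXct("2020-03-01 14:30:00", tz = "UTC")

# Numeric alternative: number of days since the first case
onset_days <- as.numeric(onset_date - min(onset_date))
onset_days
#> [1] 0 3 9
```

Any of these columns could then be tagged as date_onset, although Date objects remain the recommended choice.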
See also
* An overview of the linelist package
* tags_names(): for a list of known tag names
* tags_types(): for the associated accepted types/classes
* tags(): for a list of tagged variables in a linelist
* set_tags(): for modifying tags
* tags_df(): for selecting variables by tags
Author
Thibaut Jombart thibaut@data.org
Examples
if (require(outbreaks)) {
  ## dataset we will convert to linelist
  head(measles_hagelloch_1861)

  ## create linelist
  x <- make_linelist(measles_hagelloch_1861,
    id = "case_ID",
    date_onset = "date_of_prodrome",
    age = "age",
    gender = "gender"
  )

  ## print result - just first few entries
  head(x)

  ## check tags
  tags(x)
}
#> $id
#> [1] "case_ID"
#>
#> $date_onset
#> [1] "date_of_prodrome"
#>
#> $gender
#> [1] "gender"
#>
#> $age
#> [1] "age"
#>