Checks, row by row, whether the dates in a set of specified columns are in chronological order.

Usage

check_date_sequence(data, target_columns)

Arguments

data

The input <data.frame> or <linelist>

target_columns

A <vector> of column names for events. Specify at least two column names, listed in the expected chronological order. For example: target_columns = c("date_symptoms_onset", "date_hospitalization", "date_death"). When the input data is a <linelist> object, this parameter can be set to linelist_tags to apply the date sequence check exclusively to the tagged columns. The date values in the target columns should be in ISO 8601 format, e.g., 2024-12-31. If they are not, use the standardize_dates() function to standardize the target columns first.
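To illustrate what "in the expected order" means, the following base R sketch flags rows whose dates are out of sequence. It is not the package's implementation: the toy data frame `df` and the helper objects `m`, `is_ordered`, and `flagged` are hypothetical, and treating equal dates as correctly ordered is an assumption made here for simplicity.

```r
# Hypothetical toy data in ISO 8601 format; the column names mirror the
# example given for target_columns.
df <- data.frame(
  date_symptoms_onset  = as.Date(c("2024-01-01", "2024-02-10")),
  date_hospitalization = as.Date(c("2024-01-05", "2024-02-01")),
  date_death           = as.Date(c("2024-01-20", "2024-02-15"))
)

# Columns listed in the expected chronological order.
target_columns <- c(
  "date_symptoms_onset", "date_hospitalization", "date_death"
)

# Convert the dates to day counts, one matrix row per record.
m <- do.call(cbind, lapply(df[target_columns], as.numeric))

# A row is in order when its values are non-decreasing from left to right.
# Assumption: ties count as ordered and missing values are ignored.
is_ordered <- apply(m, 1, function(x) !is.unsorted(x, na.rm = TRUE))

# Row indices with an incorrect date sequence.
flagged <- which(!is_ordered)
flagged
#> [1] 2
```

In the second row, hospitalization precedes symptom onset, so that row is flagged. Swapping the order of the names in `target_columns` changes which rows count as incorrect, which is why the order of the column names matters.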

Value

The input dataset. Any incorrect date sequences that are detected are stored in the report and can be accessed with the print_report() function, as shown in the example below.

Examples

# load the package
library(cleanepi)

# import the data
data <- readRDS(system.file("extdata", "test_df.RDS", package = "cleanepi"))

# standardize the date values
data <- data %>%
  standardize_dates(
    target_columns  = c("date_first_pcr_positive_test", "date.of.admission"),
    error_tolerance = 0.4,
    format = NULL,
    timeframe = NULL
  )
#> ! Detected 4 values that comply with multiple formats and no values that are
#>   outside of the specified time frame.
#>  Enter `print_report(data = dat, "date_standardization")` to access them,
#>   where "dat" is the object used to store the output from this operation.

# check whether all admission dates come after the test dates
good_date_sequence <- check_date_sequence(
  data = data,
  target_columns = c("date_first_pcr_positive_test", "date.of.admission")
)
#> ! Detected 2 incorrect date sequences at lines: "6, 8".
#>  Enter `print_report(data = dat, "incorrect_date_sequence")` to access them,
#>   where "dat" is the object used to store the output from this operation.

# display rows where admission dates do not come after the test dates
print_report(
  data = good_date_sequence,
  what = "incorrect_date_sequence"
)
#>   date_first_pcr_positive_test date.of.admission row_id
#> 6                   2021-05-02        2021-02-17      6
#> 8                   2021-09-20        2021-02-22      8