Perform dictionary-based cleaning
Arguments
- data
 The input <data.frame> or <linelist>.
- dictionary
 A <data.frame> with the dictionary associated with the input data. This is expected to be compatible with the matchmaker package and to contain the following four columns, of which the first three are required:
 - options: The current values used to represent the different groups in the input data frame (required).
 - values: The values that will replace the current options (required).
 - grp: The name of the column to which each option belongs (required).
 - orders: The user-defined order of the different options (optional).
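As a rough illustration of the dictionary layout described above, the sketch below builds a small dictionary by hand and applies it with base R only. The column contents (`"1"`, `"M"`, `"male"`, and so on) and the input data frame are hypothetical, and the replacement loop is a stand-in that mimics the documented behaviour; it is not how cleanepi or matchmaker implement it.

```r
# Hypothetical dictionary for a 'gender' column (values are illustrative only)
dictionary <- data.frame(
  options = c("1", "2", "M", "F"),                   # values currently in the data
  values  = c("male", "female", "male", "female"),   # their replacements
  grp     = rep("gender", 4),                        # column each option belongs to
  orders  = 1:4,                                     # optional user-defined order
  stringsAsFactors = FALSE
)

# Hypothetical input data
data <- data.frame(
  id     = 1:4,
  gender = c("1", "F", "M", "2"),
  stringsAsFactors = FALSE
)

# Base-R stand-in for the replacement step (an assumption, not the package code):
# for each column named in 'grp', swap every known option for its value
for (col in unique(dictionary$grp)) {
  entries <- dictionary[dictionary$grp == col, ]
  idx     <- match(data[[col]], entries$options)
  found   <- !is.na(idx)
  data[[col]][found] <- entries$values[idx[found]]
}

print(data)
```

Options that have no entry in the dictionary are left unchanged by this sketch, which is the situation the Examples section reports as a misspelled value.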
Value
A <data.frame> or <linelist> where the target options
have been replaced with their corresponding values in the columns
specified in the data dictionary.
Examples
data <- readRDS(
  system.file("extdata", "messy_data.RDS", package = "cleanepi")
)
dictionary <- readRDS(
  system.file("extdata", "test_dict.RDS", package = "cleanepi")
)
# adding an option that is not defined in the dictionary to the 'gender'
# column
data$gender[2] <- "homme"
cleaned_df <- clean_using_dictionary(
  data = data,
  dictionary = dictionary
)
#> ! Cannot replace "homme" present in column gender but not defined in the dictionary.
#> ℹ You can either:
#> • correct the misspelled option from the input data, or
#> • add it to the dictionary using the `add_to_dictionary()` function.
# print the report
print_report(cleaned_df, "misspelled_values")
#>   idx column value
#> 1   2 gender homme
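# One way to resolve the message above is to add the missing option to the
# dictionary. The package provides `add_to_dictionary()` for this; its
# arguments are not shown in this section, so the sketch below appends the row
# with base R instead. The stand-in dictionary and the mapping of "homme" to
# "male" are assumptions made for illustration.

```r
# Stand-in dictionary with the four columns described under Arguments
dictionary <- data.frame(
  options = c("1", "2"),
  values  = c("male", "female"),
  grp     = c("gender", "gender"),
  orders  = c(1L, 2L),
  stringsAsFactors = FALSE
)

# New row mapping the unrecognised option "homme" to an existing value
new_entry <- data.frame(
  options = "homme",
  values  = "male",
  grp     = "gender",
  orders  = max(dictionary$orders) + 1L,  # place it after the existing options
  stringsAsFactors = FALSE
)

# Append the row; the extended dictionary can then be passed to
# clean_using_dictionary() as in the example above
dictionary <- rbind(dictionary, new_entry)
print(dictionary)
```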