Standardize column names of a data frame or line list — standardize_column_names

All column names are converted to snake_case. When the conversion does not give the expected result for a column, use the keep argument to leave that name unchanged, or the rename argument to set the new name explicitly.
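To illustrate the kind of conversion involved, and why it can sometimes need manual help, here is a minimal base R sketch. The helper to_snake() is hypothetical and written only for this illustration; it is not the cleanepi implementation, and the package may handle these inputs differently.

```r
# Hypothetical helper: a rough base R approximation of snake_case conversion.
# This is an illustration only, NOT the code used by cleanepi.
to_snake <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x) # split camelCase boundaries
  x <- gsub("[^A-Za-z0-9]+", "_", x)           # turn separators into "_"
  x <- gsub("^_+|_+$", "", x)                  # trim leading/trailing "_"
  tolower(x)
}

converted <- to_snake(c("dateOfBirth", "date.of.admission", "Study ID"))
converted

# An acronym followed by a word has no lower-to-upper boundary, so this
# naive sketch cannot separate the two parts.
acronym <- to_snake("HIVStatus")
acronym
```

In this sketch the acronym case collapses into a single word, which is the sort of result where choosing the name by hand (via rename) or leaving it untouched (via keep) is the simpler fix.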

Usage

standardize_column_names(data, keep = NULL, rename = NULL)

Arguments

data: The input <data.frame> or <linelist>.
keep: A <vector> of column names to leave unchanged. When dealing with a <linelist>, this can be set to linelist_tags to preserve the tagged column names. The default is NULL.
rename: A named <vector> of columns to be renamed, where each name is the new column name and each value is the existing column name, for example c(new_name1 = "old_name1", new_name2 = "old_name2"). The default is NULL.
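The way keep and rename combine can be sketched in base R as follows. The function sketch_standardize() and its inner to_snake() helper are hypothetical names used only for this illustration; they assume that kept columns are left untouched, renamed columns receive the new name exactly as supplied, and every other column is converted. The sketch operates on a character vector of names rather than on a data frame, and it is not the cleanepi source.

```r
# Hypothetical sketch of the keep/rename semantics; NOT the cleanepi source.
sketch_standardize <- function(nms, keep = NULL, rename = NULL) {
  # placeholder converter, assumed for illustration only
  to_snake <- function(x) {
    x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)
    x <- gsub("[^A-Za-z0-9]+", "_", x)
    tolower(gsub("^_+|_+$", "", x))
  }
  out <- nms
  # names that are neither kept nor renamed are converted to snake_case
  convert <- !(nms %in% c(keep, rename))
  out[convert] <- to_snake(nms[convert])
  # renamed columns receive the new name exactly as supplied
  if (length(rename) > 0) {
    idx <- match(rename, nms)
    found <- !is.na(idx)
    out[idx[found]] <- names(rename)[found]
  }
  out
}

before <- c("study_id", "date.of.admission", "dateOfBirth", "sex")
after <- sketch_standardize(
  before,
  keep = "date.of.admission",
  rename = c(DOB = "dateOfBirth", gender = "sex")
)
data.frame(before, after)
```

Note that in rename the new name is on the left of each pair and the existing name is on the right, which is the reverse of how a lookup table is often written.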

Value

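A <data.frame> or <linelist> whose column names have been converted to snake_case, except for the columns listed in keep (left unchanged) and those listed in rename (given the supplied new names).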

Examples

# do not rename 'date.of.admission'
cleaned_data <- standardize_column_names(
  data = readRDS(
    system.file("extdata", "test_df.RDS", package = "cleanepi")
  ),
  keep = "date.of.admission"
)

# do not rename 'date.of.admission', but rename 'dateOfBirth' and 'sex' to
# 'DOB' and 'gender' respectively
cleaned_data <- standardize_column_names(
  data = readRDS(
    system.file("extdata", "test_df.RDS", package = "cleanepi")
  ),
  keep = "date.of.admission",
  rename = c(DOB = "dateOfBirth", gender = "sex")
)

# print the report
print_report(
  data = cleaned_data,
  what = "colnames"
)
#>                         before                        after
#> 1                     study_id                     study_id
#> 2                   event_name                   event_name
#> 3                 country_code                 country_code
#> 4                 country_name                 country_name
#> 5            date.of.admission            date.of.admission
#> 6                  dateOfBirth                          DOB
#> 7 date_first_pcr_positive_test date_first_pcr_positive_test
#> 8                          sex                       gender