Package index
Exported functions
cleanepi functions available to end-users
-
add_to_dictionary() - Add an element to the data dictionary
-
add_to_report() - Add an element to the report object
-
check_date_sequence() - Checks whether the order in a sequence of date events is chronological.
-
check_subject_ids() - Check whether the subject IDs comply with the expected format. When incorrect IDs are found, the function sends a warning and the user can call the correct_subject_ids function to correct them.
-
clean_data() - Clean and standardize data
-
clean_using_dictionary() - Perform dictionary-based cleaning
-
common_na_strings - Common strings representing missing values
-
convert_numeric_to_date() - Convert numeric to date
-
convert_to_numeric() - Convert columns into numeric
-
correct_misspelled_values() - Correct misspelled values by using approximate string matching techniques to compare them against the expected values.
-
correct_subject_ids() - Correct the wrong subject IDs based on the user-provided values.
-
find_duplicates() - Identify and return duplicated rows in a data frame or linelist.
-
get_default_params() - Set and return clean_data default parameters
-
print_report() - Generate report from data cleaning operations
-
remove_constants() - Remove constant data, including empty rows, empty columns, and columns with constant values.
-
remove_duplicates() - Remove duplicates
-
replace_missing_values() - Replace missing values with NA
-
scan_data() - Scan through a data frame and return the proportion of missing, numeric, Date, character, logical values.
-
standardize_column_names() - Standardize column names of a data frame or line list
-
standardize_dates() - Standardize date variables
-
timespan() - Calculate time span between dates
-
get_target_column_names() - Get the names of the columns from which duplicates will be found
-
add_to_report() - Add an element to the report object
-
numbers_only() - Detects whether a string contains only numbers or not.
-
retrieve_column_names() - Get column names
-
tr_() - Flag the messages to be translated using the potools package
-
clean_data() - Clean and standardize data
-
scan_data() - Scan through a data frame and return the proportion of missing, numeric, Date, character, logical values.
-
scan_in_character() - Scan through a character column
-
print_report() - Generate report from data cleaning operations
-
standardize_column_names() - Standardize column names of a data frame or line list
-
make_unique_column_names() - Make column names unique when duplicated column names are found after the transformation
-
convert_numeric_to_date() - Convert numeric to date
-
convert_to_numeric() - Convert columns into numeric
-
detect_to_numeric_columns() - Detect the numeric columns that appear as characters due to the presence of some character values in the column.
-
standardize_dates() - Standardize date variables
-
date_check_outsiders() - Convert and update date values
-
date_check_timeframe() - Check date time frame
-
date_choose_first_good() - Choose the first non-missing date from a data frame of dates
-
date_convert() - Convert characters to dates
-
date_detect_complex_format() - Detect complex date format
-
date_detect_day_or_month() - Detect the appropriate abbreviation for day or month value
-
date_detect_format() - Detect a date format with only 1 separator
-
date_detect_separator() - Detect the special character that is the separator in the date values
-
date_detect_simple_format() - Get format from a simple Date value
-
date_get_format() - Infer date format from a vector of characters
-
date_get_part1() - Split a string based on a pattern and return the first element of the resulting vector.
-
date_get_part2() - Get part2 of date value
-
date_get_part3() - Get part3 of date value
-
date_guess() - Try to guess dates from a character vector
-
date_guess_convert() - Guess if a character vector contains Date values, and convert them to date
-
date_i_guess_and_convert() - Extract date from a character vector
-
date_make_format() - Build the auto-detected format
-
date_match_format_and_column() - Check whether the number of provided formats matches the number of target columns to be standardized.
-
date_process() - Process date variable
-
date_rescue_lubridate_failures() - Find the dates that lubridate couldn't
-
date_trim_outliers() - Trim dates outside of the defined timeframe
Dictionary-based substitution
Substitutes specified options in data frame columns with their corresponding values
-
dictionary_make_metadata() - Make data dictionary for 1 field
-
add_to_dictionary() - Add an element to the data dictionary
-
clean_using_dictionary() - Perform dictionary-based cleaning
-
construct_misspelled_report() - Build the report for the detected misspelled values during dictionary-based data cleaning operation
-
detect_misspelled_options() - Detect misspelled options in columns to be cleaned
-
print_misspelled_values() - Print the detected misspelled values
Check spelling mistakes
Substitutes misspelled values with their closest match from a user-provided vector of words
-
correct_misspelled_values() - Correct misspelled values by using approximate string matching techniques to compare them against the expected values.
-
find_duplicates() - Identify and return duplicated rows in a data frame or linelist.
-
remove_duplicates() - Remove duplicates
-
perform_remove_constants() - Remove constant data.
-
remove_constants() - Remove constant data, including empty rows, empty columns, and columns with constant values.
-
replace_missing_values() - Replace missing values with NA
-
replace_with_na() - Detect and replace values with NA from a vector
-
timespan() - Calculate time span between dates
-
check_subject_ids() - Check whether the subject IDs comply with the expected format. When incorrect IDs are found, the function sends a warning and the user can call the correct_subject_ids function to correct them.
-
correct_subject_ids() - Correct the wrong subject IDs based on the user-provided values.
-
check_subject_ids_oness() - Checks the uniqueness in values of the sample IDs column
-
check_date_sequence() - Checks whether the order in a sequence of date events is chronological.
-
is_date_sequence_ordered() - Check order of a sequence of date-events
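
As a rough illustration of what a chronological-order check on a sequence of date events involves, the following base R sketch flags rows whose dates are out of sequence. It does not call cleanepi and is not the package's implementation; the data frame, its column names, and the helper is_chronological() are hypothetical and exist only for this example.

```r
# Hypothetical line list with three date events per row (made-up data).
events <- data.frame(
  date_onset     = as.Date(c("2023-01-02", "2023-02-10", "2023-03-05")),
  date_admission = as.Date(c("2023-01-05", "2023-02-08", "2023-03-07")),
  date_discharge = as.Date(c("2023-01-09", "2023-02-15", "2023-03-06"))
)

# Expected chronological order of the event columns.
ordered_cols <- c("date_onset", "date_admission", "date_discharge")

# Hypothetical helper: a sequence is in order when no date is earlier
# than the one before it. Missing values are ignored.
is_chronological <- function(dates) {
  all(diff(as.numeric(dates)) >= 0, na.rm = TRUE)
}

# Apply the check row by row; unlist() turns the dates into day counts.
in_order <- vapply(
  seq_len(nrow(events)),
  function(i) is_chronological(unlist(events[i, ordered_cols])),
  logical(1)
)

# Rows whose events are out of sequence.
events[!in_order, ]
```

In this sketch, equal consecutive dates count as ordered; a stricter check would use `> 0` instead of `>= 0`.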