Package index
Exported functions
cleanepi functions available to end-users
-
add_to_dictionary()
- Add an element to the data dictionary
-
add_to_report()
- Add an element to the report object
-
check_date_sequence()
- Check whether the order of the sequence of date-events is valid.
-
check_subject_ids()
- Check whether the subject IDs comply with the expected format. When incorrect IDs are found, the function sends a warning and the user can call the correct_subject_ids() function to correct them.
-
clean_data()
- Clean and standardize data
-
clean_using_dictionary()
- Perform dictionary-based cleaning
-
common_na_strings
- Common strings representing missing values
-
convert_numeric_to_date()
- Convert numeric to date
-
convert_to_numeric()
- Convert columns into numeric
-
correct_subject_ids()
- Correct the wrong subject IDs based on the user-provided values.
-
find_duplicates()
- Identify and return duplicated rows in a data frame or linelist.
-
print_report()
- Generate report from data cleaning operations
-
remove_constants()
- Remove constant data, i.e. empty rows, empty columns, and constant columns
-
remove_duplicates()
- Remove duplicates
-
replace_missing_values()
- Replace missing values with NA
-
scan_data()
- Scan through a data frame and return the proportion of missing, numeric, Date, character, and logical values.
-
standardize_column_names()
- Standardize column names of a data frame or linelist
-
standardize_dates()
- Standardize date variables
-
timespan()
- Calculate time span between dates
-
get_target_column_names()
- Get the names of the columns from which duplicates will be found
-
add_to_report()
- Add an element to the report object
-
get_sum()
- Get sum of numbers from a string
-
numbers_only()
- Detect whether a string contains only numbers.
-
clean_data()
- Clean and standardize data
-
scan_data()
- Scan through a data frame and return the proportion of missing, numeric, Date, character, and logical values.
-
scan_in_character()
- Scan through a character column
-
print_report()
- Generate report from data cleaning operations
-
standardize_column_names()
- Standardize column names of a data frame or linelist
-
convert_numeric_to_date()
- Convert numeric to date
-
convert_to_numeric()
- Convert columns into numeric
-
detect_to_numeric_columns()
- Detect the numeric columns that appear as characters due to the presence of some character values in the column.
-
standardize_dates()
- Standardize date variables
-
date_check_column_existence()
- Check if date column exists in the given dataset
-
date_check_outsiders()
- Convert and update the date values
-
date_check_timeframe()
- Check date time frame
-
date_choose_first_good()
- Choose the first non-missing date from a data frame of dates
-
date_convert()
- Convert characters to dates
-
date_detect_complex_format()
- Detect complex date format
-
date_detect_day_or_month()
- Detect the appropriate abbreviation for day or month value
-
date_detect_format()
- Detect a date format with only 1 separator
-
date_detect_separator()
- Detect the special character that is the separator in the date values
-
date_detect_simple_format()
- Get format from a simple Date value
-
date_get_format()
- Infer date format from a vector of characters
-
date_get_part1()
- Split a string based on a pattern and return the first element of the resulting vector.
-
date_get_part2()
- Get part2 of date value
-
date_get_part3()
- Get part3 of date value
-
date_guess()
- Try to guess dates from a character vector
-
date_guess_convert()
- Guess if a character vector contains Date values, and convert them to date
-
date_i_guess_and_convert()
- Extract date from a character vector
-
date_make_format()
- Build the auto-detected format
-
date_match_format_and_column()
- Check whether the number of provided formats matches the number of target columns to be standardized.
-
date_process()
- Process date variable
-
date_rescue_lubridate_failures()
- Find the dates that lubridate couldn't
-
date_trim_outliers()
- Trim dates outside of the defined boundaries
-
convert_numeric_to_date()
- Convert numeric to date
Dictionary-based substitution
Substitute given options from columns in a data frame with their corresponding values
-
dictionary_make_metadata()
- Make data dictionary for 1 field
-
add_to_dictionary()
- Add an element to the data dictionary
-
clean_using_dictionary()
- Perform dictionary-based cleaning
-
make_readcap_dictionary()
- Convert a REDCap data dictionary into the {matchmaker} dictionary format
-
construct_misspelled_report()
- Build the report for the misspelled values detected during the dictionary-based data cleaning operation
-
detect_misspelled_options()
- Detect misspelled options in columns to be cleaned
-
print_misspelled_values()
- Print the detected misspelled values
-
find_duplicates()
- Identify and return duplicated rows in a data frame or linelist.
-
remove_duplicates()
- Remove duplicates
-
remove_constants()
- Remove constant data, i.e. empty rows, empty columns, and constant columns
-
replace_missing_values()
- Replace missing values with NA
-
timespan()
- Calculate time span between dates
-
check_subject_ids()
- Check whether the subject IDs comply with the expected format. When incorrect IDs are found, the function sends a warning and the user can call the correct_subject_ids() function to correct them.
-
correct_subject_ids()
- Correct the wrong subject IDs based on the user-provided values.
-
check_date_sequence()
- Check whether the order of the sequence of date-events is valid.
-
is_date_sequence_ordered()
- Check order of a sequence of date-events