Correct the wrong subject IDs based on the user-provided values.
Source: R/standardize_subject_ids.R
correct_subject_ids.Rd
After detecting incorrect subject IDs with the check_subject_ids()
function, use this function to supply the correct IDs and substitute them
for the incorrect ones.
Examples
data <- readRDS(
  system.file("extdata", "test_df.RDS", package = "cleanepi")
)
# detect the incorrect subject IDs, i.e. IDs that fail one or more of the
# following requirements:
#   - starts with 'PS',
#   - ends with 'P2',
#   - has a number between 1 and 100,
#   - contains 7 characters.
dat <- check_subject_ids(
  data = data,
  target_columns = "study_id",
  prefix = "PS",
  suffix = "P2",
  range = c(1, 100),
  nchar = 7
)
#> ! Detected no missing, no duplicated, and 3 incorrect subject IDs.
#> ℹ Enter `print_report(data = dat, "incorrect_subject_id")` to access them,
#> where "dat" is the object used to store the output from this operation.
#> ℹ You can use the `correct_subject_ids()` function to correct them.
# display rows with invalid subject ids
print_report(dat, "incorrect_subject_id")
#> $invalid_subject_ids
#>   idx       ids
#> 1   3 PS004P2-1
#> 2   5   P0005P2
#> 3   7   PB500P2
#>
# generate the correction table
correction_table <- data.frame(
  from = c("P0005P2", "PB500P2", "PS004P2-1"),
  to = c("PB005P2", "PB050P2", "PS004P2")
)
# perform the correction
dat <- correct_subject_ids(
  data = dat,
  target_columns = "study_id",
  correction_table = correction_table
)
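
To make the role of the `from` and `to` columns concrete, the substitution can be sketched in base R on a plain character vector. This is only an illustration of the idea and not necessarily how `correct_subject_ids()` is implemented. The `ids` vector is hypothetical: it reuses the three incorrect IDs reported above plus one assumed valid ID, `PS001P2`.

```r
# Hypothetical ID vector: one assumed valid ID plus the three incorrect IDs
# reported in the example above.
ids <- c("PS001P2", "PS004P2-1", "P0005P2", "PB500P2")

# Same correction table as in the example above.
correction_table <- data.frame(
  from = c("P0005P2", "PB500P2", "PS004P2-1"),
  to = c("PB005P2", "PB050P2", "PS004P2"),
  stringsAsFactors = FALSE
)

# Position of each ID in the `from` column; NA when no correction applies.
idx <- match(ids, correction_table$from)

# Replace only the matched IDs and leave all other IDs untouched.
corrected <- ids
corrected[!is.na(idx)] <- correction_table$to[idx[!is.na(idx)]]

print(corrected)
```

Using `match()` keeps the lookup independent of row order in the correction table, and IDs that do not appear in `from` pass through unchanged.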