When removing duplicates, users can specify the set of columns to consider with the 'target_columns' argument.

Usage

remove_duplicates(data, target_columns = NULL)

Arguments

data

An input data frame or linelist.

target_columns

A vector of column names to use when looking for duplicates. When the input data is a linelist object, this parameter can be set to "linelist_tags" to look for duplicates on the tagged columns only. Default is NULL.

Value

A data frame or linelist without the duplicated rows or the constant columns.

Examples

no_dups <- remove_duplicates(
  data           = readRDS(system.file("extdata", "test_linelist.RDS",
                                       package = "cleanepi")),
  target_columns = "linelist_tags"
)
#> Found 57 duplicated rows. Please consult the report for more details.
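
The effect of restricting the duplicate search to a few columns can be sketched with base R alone. This is only an illustration of the idea behind 'target_columns', not the implementation used by remove_duplicates(); the toy data frame and the column names "id", "date" and "outcome" are assumptions made for this example, and the function itself may differ in details such as which row it keeps or what it reports.

```r
# Toy data (hypothetical): rows 1 and 3 share the same id and date
# but differ in outcome.
df <- data.frame(
  id      = c(1, 2, 1, 3),
  date    = c("2023-01-01", "2023-01-02", "2023-01-01", "2023-01-03"),
  outcome = c("recovered", "died", "died", "recovered"),
  stringsAsFactors = FALSE
)

# Considering every column: no row is an exact copy of an earlier one.
dup_all <- duplicated(df)

# Considering only the chosen target columns: row 3 repeats row 1
# on "id" and "date", so it is flagged as a duplicate.
target_columns <- c("id", "date")
dup_target <- duplicated(df[, target_columns, drop = FALSE])

# Keep the first occurrence of each combination of target columns.
deduplicated <- df[!dup_target, , drop = FALSE]
```

Here the choice of columns changes the result: no row is dropped when all columns are compared, while one row is dropped when only "id" and "date" are compared.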