
This function builds pairs of vaccinated and unvaccinated individuals with similar characteristics. It relies on the matching algorithm implemented in the package {MatchIt}, setting by default method = "nearest", ratio = 1, and distance = "mahalanobis". Both exact and nearest matching on characteristics are supported, through the parameters exact and nearest, respectively. The parameter nearest must be a named vector that gives the caliper for each characteristic (e.g., nearest = c(characteristic1 = n1, characteristic2 = n2), where n1 and n2 are the calipers).

The default matching method is static: pairs are matched once, without taking into account their vaccination, censoring, and outcome dates. Afterwards, pairs whose exposure times do not coincide are removed to avoid negative times-to-event. Rolling calendar matching will be included in future releases.

The function returns a matched cohort adjusted by exposure. It includes the beginning of the follow-up period of each pair (t0_follow_up), which corresponds to the vaccination date of the vaccinated individual, the individual time-to-event (time_to_event), and the outcome status (outcome_status); the last two account for right-censoring dates. A pair is censored if either partner, vaccinated or unvaccinated, was censored (i.e., censoring_date_col is provided) and the censoring occurs before their outcomes.
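The follow-up quantities described above can be illustrated with a minimal base R sketch. The helper follow_up() is hypothetical and is not exported by the package; it assumes that follow-up ends at the earliest of the outcome date, the censoring date, and the end of the study, and that an outcome only counts if it is not preceded by censoring. The package's internal implementation may differ.

```r
# Hypothetical helper for illustration only; not part of the package API.
follow_up <- function(t0, outcome_date, censoring_date, end_cohort) {
  # Work in days since the epoch so that missing dates are handled explicitly
  t0_num <- as.numeric(t0)
  # Assumption: follow-up ends at the earliest of outcome, censoring,
  # and end of study; missing dates are ignored
  end_num <- pmin(
    as.numeric(outcome_date),
    as.numeric(censoring_date),
    as.numeric(end_cohort),
    na.rm = TRUE
  )
  # The outcome counts only if it is observed and not preceded by censoring
  observed <- !is.na(outcome_date) & as.numeric(outcome_date) <= end_num
  data.frame(
    t0_follow_up = t0,
    time_to_event = end_num - t0_num,
    outcome_status = as.integer(observed)
  )
}

# Toy matched pair: both partners share t0, the immunization date
# of the vaccinated individual
pair <- data.frame(
  vaccine_status = c("v", "u"),
  death_date = as.Date(c(NA, "2044-07-01")),
  death_other_causes = as.Date(c(NA, NA))
)
t0 <- rep(as.Date("2044-05-14"), 2)

res <- follow_up(
  t0 = t0,
  outcome_date = pair$death_date,
  censoring_date = pair$death_other_causes,
  end_cohort = as.Date("2044-12-31")
)
res
```

In this toy pair, the vaccinated individual is followed until the end of the study with no event, while the unvaccinated partner has an event 48 days after t0_follow_up.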

Usage

match_cohort(
  data_set,
  outcome_date_col,
  censoring_date_col,
  start_cohort,
  end_cohort,
  method = "static",
  nearest = NULL,
  exact = NULL,
  immunization_date_col = "immunization_date",
  vacc_status_col = "vaccine_status",
  vaccinated_status = "v",
  unvaccinated_status = "u"
)

Arguments

data_set

data.frame with cohort information (see example).

outcome_date_col

Name of the column that contains the outcome dates.

censoring_date_col

Name of the column that contains the censoring date. NULL by default.

start_cohort

Start date of the study.

end_cohort

End date of the study.

method

Method to match the cohort. Default is static.

nearest

Named vector with name(s) of column(s) for nearest matching and caliper(s) for each variable (e.g., nearest = c("characteristic1" = n1, "characteristic2" = n2), where n1 and n2 are the calipers). Default is NULL.

exact

Name(s) of column(s) for exact matching. Default is NULL.

immunization_date_col

Name of the column that contains the immunization date to set the beginning of the follow-up period (t0_follow_up). Default is immunization_date.

vacc_status_col

Name of the column containing the vaccination status. Default is vaccine_status.

vaccinated_status

Status assigned to the vaccinated population. Default is v.

unvaccinated_status

Status assigned to the unvaccinated population. Default is u.
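As a rough sketch of how the exact and nearest arguments could translate into a {MatchIt} call with the defaults named in the description, consider the following. The toy data, the column name vaccinated, and the use of std.caliper = FALSE (calipers in the raw units of each variable) are assumptions made for illustration; the call that match_cohort() builds internally may differ. The {MatchIt} call is guarded so that the snippet also runs when the package is not installed.

```r
# Hypothetical inputs mirroring the arguments documented above
exact <- "sex"
nearest <- c(age = 1)

# Covariates entering the matching formula: exact and nearest variables
covariates <- c(exact, names(nearest))
match_formula <- reformulate(covariates, response = "vaccinated")

# Toy cohort (made up for this sketch): 1 = vaccinated, 0 = unvaccinated
toy <- data.frame(
  vaccinated = c(1, 1, 1, 0, 0, 0, 0),
  sex = c("F", "M", "F", "F", "M", "F", "M"),
  age = c(30, 40, 50, 31, 40, 70, 20)
)

if (requireNamespace("MatchIt", quietly = TRUE)) {
  m <- MatchIt::matchit(
    match_formula,
    data = toy,
    method = "nearest",
    ratio = 1,
    distance = "mahalanobis",
    exact = exact,          # exact matching on sex
    caliper = nearest,      # caliper on age, named by variable
    std.caliper = FALSE     # assumption: caliper in raw units (years)
  )
  # Matched rows, with the pair identifier in the subclass column
  matched <- MatchIt::match.data(m)
  print(matched)
}
```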

Value

Object of class match. A list with the results of the static matching: match, a data.frame with the adjusted cohort; summary, the matching summary; balance_all, the balance of the cohort before matching; and balance_matched, the balance of the cohort after matching.

Four columns are added to the structure provided in data_set: subclass: ID of the matched pair, t0_follow_up: beginning of the follow-up period for the pair, time_to_event: time to event, and outcome_status: outcome status (1: positive, 0: negative).

Examples

# Define start and end dates of the study
start_cohort <- as.Date("2044-01-01")
end_cohort <- as.Date("2044-12-31")

# Create `data.frame` with information on immunization
cohortdata <- make_immunization(
  data_set = cohortdata,
  outcome_date_col = "death_date",
  censoring_date_col = "death_other_causes",
  immunization_delay = 14,
  vacc_date_col = "vaccine_date_2",
  end_cohort = end_cohort
)

# Match the data_set
matching <- match_cohort(
  data_set = cohortdata,
  outcome_date_col = "death_date",
  censoring_date_col = "death_other_causes",
  start_cohort = start_cohort,
  end_cohort = end_cohort,
  method = "static",
  exact = "sex",
  nearest = c(age = 1)
)

# Check match balance and summary
# `warnings_log = TRUE` displays the logs created
# during the iterative process.
summary(matching, warnings_log = TRUE)
#> Balance all:
#>                u          v        smd
#> age   30.9928364 48.1656078  0.8599792
#> sex_F  0.4834661  0.5758635  0.1859192
#> sex_M  0.5165339  0.4241365 -0.1859192
#> 
#> Balance matched:
#>                u          v       smd
#> age   43.1952984 45.6560579 0.1330454
#> sex_F  0.5555154  0.5555154 0.0000000
#> sex_M  0.4444846  0.4444846 0.0000000
#> 
#> Summary:
#>               u     v
#> All       62538 37462
#> Matched   33180 33180
#> Unmatched 29358  4282
#> 
#> Warnings:
#> Error at iteration 2: Assertion on 'data_set' failed: Must have at least 1 rows, but has 0 rows.- skipping to next 
#> Error at iteration 3: Assertion on 'data_set' failed: Must have at least 1 rows, but has 0 rows.- skipping to next 
#> Error at iteration 4: Assertion on 'data_set' failed: Must have at least 1 rows, but has 0 rows.- skipping to next 
#> Error at iteration 5: Assertion on 'data_set' failed: Must have at least 1 rows, but has 0 rows.- skipping to next 
#> Error at iteration 5: Assertion on 'data_set' failed: Must have at least 1 rows, but has 0 rows.- skipping to next 
#> Matches before iterating: 66014 
#> Removed before iterating 946 
#> Matches after iterating: 66360 
#> Removed after iterating 600 

# Extract matched data
cohortdata_match <- get_dataset(matching)

# View the matched cohort
head(cohortdata_match)
#>         id sex age death_date death_other_causes vaccine_date_1 vaccine_date_2
#> 1 afade1b2   F  37       <NA>               <NA>           <NA>           <NA>
#> 2 556c8c76   M  19       <NA>               <NA>           <NA>           <NA>
#> 3 04edf85a   M  50       <NA>               <NA>           <NA>           <NA>
#> 4 7e51a18e   F   8       <NA>               <NA>           <NA>           <NA>
#> 5 c5a83f56   M  66       <NA>               <NA>           <NA>           <NA>
#> 6 7f675ec3   M  29       <NA>               <NA>     2044-04-09     2044-04-30
#>   vaccine_1 vaccine_2 immunization_date vaccine_status subclass t0_follow_up
#> 1      <NA>      <NA>              <NA>              u     1330   2044-12-05
#> 2      <NA>      <NA>              <NA>              u    14154   2044-12-15
#> 3      <NA>      <NA>              <NA>              u    22648   2044-08-14
#> 4      <NA>      <NA>              <NA>              u     8438   2044-12-24
#> 5      <NA>      <NA>              <NA>              u     5550   2044-06-26
#> 6    BRAND1    BRAND1        2044-05-14              v    20552   2044-05-14
#>   time_to_event outcome_status
#> 1            26              0
#> 2            16              0
#> 3           139              0
#> 4             7              0
#> 5           188              0
#> 6           231              0