Skip to contents

ColOpenData can be used to access open geospatial data from Colombia. This data is retrieved from the Geostatistical National Framework (MGN), published by the National Administrative Department of Statistics (DANE). The MGN contains the political-administrative division and is used to reference census statistical information. Further information can be obtained directly from DANE here.

This package contains the 2018’s version of the MGN, which also included a summarized version of the National Population and Dwelling Census (CNPV) in different aggregation levels. Each level is stored in a different dataset, which can be retrieved using the download_geospatial function, which requires three arguments:

  • dataset character with the geospatial dataset name.
  • include_geom logical for including (or not) geometry. Default is TRUE.
  • include_cnpv logical for including (or not) CNPV demographic and socioeconomic information. Default is TRUE.

To better understand dataset names and details go to Documentation and Dictionaries.

Details for geospatial datasets relate to the level of aggregation as follows:

Code Level
DANE_MGN_2018_DPTO Department
DANE_MGN_2018_MPIO Municipality
DANE_MGN_2018_MPIOCL Municipality including Class
DANE_MGN_2018_MZN Block
DANE_MGN_2018_SECR Rural Sector
DANE_MGN_2018_SECU Urban Sector
DANE_MGN_2018_SETR Rural Section
DANE_MGN_2018_SETU Urban Section
DANE_MGN_2018_ZU Urban Zone

In this vignette you will learn: 1. How to download geospatial data using ColOpenData 2. How to use census data included in geospatial datasets 3. How to visualize spatial data using leaflet and ggplot2

We will be using geospatial data at the level of Urban Section (SECU) for the municipality of Tunja and will calculate the percentage of houses with internet connection at each SECU. Later, we will build some plots using the previously mentioned approaches for dynamic and static plots.

We will start by importing the needed libraries.

Disclaimer: all data is loaded to the environment in the user’s R session, but is not downloaded to user’s computer.

Downloading geospatial data

First, we download the data using the function download_geospatial, including the geometries and the census related information.

# Download complete dataset
secu <- download_geospatial(
  dataset = "DANE_MGN_2018_SECU",
  include_geom = TRUE,
  include_cnpv = TRUE
)
#> Original data is provided by the National Administrative Department of
#> Statistics (DANE).
#> Reformatted by the package authors.
#> Stored and redistributed by Universidad de Los Andes under the Epiverse
#> TRACE iniative.
head(secu)
#> Simple feature collection with 6 features and 101 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -75.54804 ymin: 6.292379 xmax: -75.54093 ymax: 6.302778
#> Geodetic CRS:  WGS 84
#>   DPTO_CCDGO MPIO_CCDGO MPIO_CDPMP CLAS_CCDGO SETR_CCDGO SETR_CCNCT SECR_CCDGO
#> 1         05        001      05001          1        000  050011000         00
#> 2         05        001      05001          1        000  050011000         00
#> 3         05        001      05001          1        000  050011000         00
#> 4         05        001      05001          1        000  050011000         00
#> 5         05        001      05001          1        000  050011000         00
#> 6         05        001      05001          1        000  050011000         00
#>    SECR_CCNCT ZU_CCDGO ZU_CDIVI SETU_CCDGO         SETU_CCNCT SECU_CCDGO
#> 1 05001100000      000 05001000       0101 050011000000000101         01
#> 2 05001100000      000 05001000       0101 050011000000000101         02
#> 3 05001100000      000 05001000       0101 050011000000000101         03
#> 4 05001100000      000 05001000       0101 050011000000000101         04
#> 5 05001100000      000 05001000       0101 050011000000000101         05
#> 6 05001100000      000 05001000       0101 050011000000000101         06
#>             SECU_CCNCT              DATOS_ANM VERSION     AREA  LATITUD
#> 1 05001100000000010101 SECCION URBANA CLASE 1    2018 71892.36 6.300526
#> 2 05001100000000010102 SECCION URBANA CLASE 1    2018 31002.72 6.298693
#> 3 05001100000000010103 SECCION URBANA CLASE 1    2018 45955.40 6.295615
#> 4 05001100000000010104 SECCION URBANA CLASE 1    2018 61747.24 6.295751
#> 5 05001100000000010105 SECCION URBANA CLASE 1    2018 27401.79 6.293369
#> 6 05001100000000010106 SECCION URBANA CLASE 1    2018 38415.27 6.294739
#>    LONGITUD STCTNENCUE STP3_1_SI STP3_2_NO STP3A_RI STP3B_TCN STP4_1_SI
#> 1 -75.54387       1251         0      1251        0         0         0
#> 2 -75.54591        668         0       668        0         0         0
#> 3 -75.54674       1289         0      1289        0         0         0
#> 4 -75.54236       1465         0      1465        0         0         0
#> 5 -75.54291        673         0       673        0         0         0
#> 6 -75.54455        742         0       742        0         0         0
#>   STP4_2_NO STP9_1_USO STP9_2_USO STP9_3_USO STP9_4_USO STP9_2_1_M STP9_2_2_M
#> 1      1251       1158         24         69          0          2         20
#> 2       668        630          6         32          0          1          4
#> 3      1289       1216         18         55          0          1          7
#> 4      1465       1276         15        174          0          3          6
#> 5       673        546         13        114          0          2          4
#> 6       742        695         12         35          0          0          9
#>   STP9_2_3_M STP9_2_4_M STP9_2_9_M STP9_3_1_N STP9_3_2_N STP9_3_3_N STP9_3_4_N
#> 1          2          0          0          1         45          8          0
#> 2          1          0          0          0         25          3          0
#> 3         10          0          0          2         36         10          0
#> 4          6          0          0          5        106         33          0
#> 5          7          0          0          2         71         27          0
#> 6          3          0          0          2         17          9          0
#>   STP9_3_5_N STP9_3_6_N STP9_3_7_N STP9_3_8_N STP9_3_9_N STP9_3_10 STP9_3_99
#> 1          0          5          4          0          0         6         0
#> 2          1          2          0          0          0         0         1
#> 3          1          4          2          0          0         0         0
#> 4          1         25          1          1          0         2         0
#> 5          3          4          5          0          0         0         2
#> 6          5          1          1          0          0         0         0
#>   STVIVIENDA STP14_1_TI STP14_2_TI STP14_3_TI STP14_4_TI STP14_5_TI STP14_6_TI
#> 1       1182        830        320         32          0          0          0
#> 2        636        407        204         24          1          0          0
#> 3       1234        688        509         37          0          0          0
#> 4       1291        897        335         57          0          0          2
#> 5        559        257        270         32          0          0          0
#> 6        707        573        110         23          0          0          1
#>   STP15_1_OC STP15_2_OC STP15_3_OC STP15_4_OC TSP16_HOG STP19_EC_1 STP19_ES_2
#> 1       1142          4          2         34      1149       1116         26
#> 2        610          6          2         18       620        589         21
#> 3       1153         24          0         57      1190       1139         14
#> 4       1222          0          5         64      1250       1222          0
#> 5        524          0          0         35       535        523          1
#> 6        678          0          3         26       694        677          1
#>   STP19_EE_1 STP19_EE_2 STP19_EE_3 STP19_EE_4 STP19_EE_5 STP19_EE_6 STP19_EE_9
#> 1        576        503          5          0          0          1         31
#> 2        158        425          3          0          0          0          3
#> 3         68       1060          2          9          0          0          0
#> 4        306        911          2          1          0          0          2
#> 5         69        450          1          0          1          0          2
#> 6        112        558          4          2          0          0          1
#>   STP19_ACU1 STP19_ACU2 STP19_ALC1 STP19_ALC2 STP19_GAS1 STP19_GAS2 STP19_GAS9
#> 1       1128         14       1129         13        596        522         24
#> 2        598         12        602          8        485        121          4
#> 3       1139         14       1138         15        938        207          8
#> 4       1217          5       1214          8        788        429          5
#> 5        522          2        523          1        363        158          3
#> 6        655         23        667         11        404        272          2
#>   STP19_REC1 STP19_REC2 STP19_INT1 STP19_INT2 STP19_INT9 STP27_PERS STPERSON_L
#> 1       1132         10        312        806         24       3624          0
#> 2        606          4        220        384          6       1915          0
#> 3       1149          4        462        683          8       3637          0
#> 4       1216          6        322        895          5       3751          0
#> 5        518          6        211        310          3       1511          0
#> 6        674          4        234        442          2       2267          0
#>   STPERSON_S STP32_1_SE STP32_2_SE STP34_1_ED STP34_2_ED STP34_3_ED STP34_4_ED
#> 1       3624       1754       1870        539        637        777        514
#> 2       1915        880       1035        250        370        370        273
#> 3       3637       1791       1846        465        611        742        603
#> 4       3751       1771       1980        517        648        791        525
#> 5       1511        745        766        176        247        334        203
#> 6       2267       1083       1184        338        439        443        353
#>   STP34_5_ED STP34_6_ED STP34_7_ED STP34_8_ED STP34_9_ED STP51_PRIM STP51_SECU
#> 1        398        378        233         95         53       1208        296
#> 2        229        225        110         60         28        647        119
#> 3        412        377        256        112         59       1260        215
#> 4        440        419        223        125         63       1124        636
#> 5        204        170         93         54         30        351        483
#> 6        252        234        112         60         36        717        631
#>   STP51_SUPE STP51_POST STP51_13_E STP51_99_E Shape_Leng  Shape_Area
#> 1        214          0        154         65 0.06413275 5.87132e-06
#> 2        143          0         91         23 0.02939734 2.53192e-06
#> 3        227          0        153         75 0.05760309 3.75305e-06
#> 4        312          0        116         59 0.04996887 5.04275e-06
#> 5        183          0         46         33 0.02597387 2.23783e-06
#> 6        123          0         59         29 0.03346429 3.13727e-06
#>                             geom
#> 1 MULTIPOLYGON (((-75.54478 6...
#> 2 MULTIPOLYGON (((-75.5458 6....
#> 3 MULTIPOLYGON (((-75.54755 6...
#> 4 MULTIPOLYGON (((-75.54172 6...
#> 5 MULTIPOLYGON (((-75.54189 6...
#> 6 MULTIPOLYGON (((-75.54537 6...

After downloading, we have to filter by the municipality code using the DIVIPOLA code for Tunja. DIVIPOLA codification is standardized for the whole country, and contains department’s and municipalities’ codes. Departments have two digits for individual identification, while municipalities have five. The codes for each municipality and department can be consulted in the following table.

To get the DIVIPOLA code of a municipality or department we can use the auxiliary functions divipola_municipality_code and divipola_department_code in ColOpenData. To retrieve a municipality code we must include the department name and the municipality name. This is to consider repetition in municipalities’ names across departments.

divipola_municipality_code("BOYACA", "TUNJA")
#> [1] "15001"

We can see that the DIVIPOLA code for Tunja is "15001". We can use this code for filtering.

To understand which column contains the municipalities’ codes and filter for Tunja, we will need the corresponding dataset dictionary. To download the dictionary we can use the dictionary function. This function uses the dataset name to download the associated information. For further information please refer to the documentation on dictionaries previously mentioned.

# Urban section dictionary
dict <- dictionary("DANE_MGN_2018_SECU")
head(dict)
#>     variable tipo longitud
#> 1 DPTO_CCDGO Text        2
#> 2 MPIO_CCDGO Text        3
#> 3 MPIO_CDPMP Text        5
#> 4 CLAS_CCDGO Text        1
#> 5 SETR_CCDGO Text        3
#> 6 SETR_CCNCT Text        9
#>                                                                          descripcion
#> 1                                                            Código del departamento
#> 2                                                               Código del municipio
#> 3                                     Código concatenado que identifica al municipio
#> 4 Código de la clase 1 cabecera municipal, 2 centro poblado y 3 área resto municipal
#> 5                                                             Código de sector rural
#> 6 Código de sector rural concatenado (departamento, municipio, clase y sector rural)
#>   categoria_original
#> 1               <NA>
#> 2               <NA>
#> 3               <NA>
#> 4               <NA>
#> 5               <NA>
#> 6               <NA>

After exploring the dictionary, we can identify the column that contains the individual municipality codes is MPIO_CDPMP. This column contains the concatenated code for department and municipality, which is a five digit unique identifier for every municipality in the country.

To calculate the percentage of houses with internet connection, we will need to know the number of houses with internet connection and the total of houses in each SECU. From the dictionary we get that the number of houses with internet connection is STP19_INT1 and the total of houses is STVIVIENDA. We will calculate the percentage as follows:

# Filter for Tunja
tunja <- secu %>% filter(MPIO_CDPMP == "15001")

# Calculate internet coverage percentage
tunja <- tunja %>% mutate(INT_PERC = round(STP19_INT1 / STVIVIENDA, 2))

Static plots (ggplot2)

ggplot2 can be used to generate static plots of spatial data by using the geometry geom_sf as follows:

ggplot(data = tunja) +
  geom_sf(mapping = aes(fill = INT_PERC), color = NA)

Color palettes and themes can be defined for each plot using the aesthetic and scales, which can be consulted in the ggplot2 documentation.

# ggplot2
ggplot(data = tunja) +
  geom_sf(mapping = aes(fill = INT_PERC), color = NA) +
  theme_minimal() +
  theme(
    panel.grid = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank()
  ) +
  scale_fill_gradient("Percentage", low = "#10bed2", high = "#deff00") +
  ggtitle(
    label = "Internet coverage",
    subtitle = "Tunja, Colombia"
  )

Dynamic plots (leaflet)

For dynamic plots, we can use leaflet, which is an open-source library for interactive maps. To create the same plot we first will create a color palette.

# Color Palette
colfunc <- colorRampPalette(c("#10bed2", "#deff00"))
pal <- colorNumeric(
  palette = colfunc(100),
  domain = tunja$INT_PERC
)

With the previous color palette we can generate the interactive plot. The package also includes open source maps for the base map like OpenStreetMap and CartoDB. For further details on leaflet, please refer to the package’s documentation.

# Leaflet
leaflet(tunja) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(
    stroke = TRUE,
    weight = 0,
    color = NA,
    fillColor = ~ pal(tunja$INT_PERC),
    fillOpacity = 1,
    popup = paste0(tunja$INT_PERC)
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = ~ tunja$INT_PERC,
    opacity = 1,
    title = "Internet Coverage"
  )