ColOpenData can be used to access open geospatial data from Colombia. This data is retrieved from the National Geostatistical Framework (Marco Geoestadístico Nacional, MGN), published by the National Administrative Department of Statistics (DANE). The MGN contains the country's political-administrative division and is used to reference census statistical information. Further information can be obtained directly from DANE.

This package contains the 2018 version of the MGN, which also includes a summarized version of the National Population and Dwelling Census (CNPV) at different aggregation levels. Each level is stored in a separate dataset, which can be retrieved with the download_geospatial function. It takes three arguments:

  • spatial_level: character with the spatial level to be consulted.
  • include_geom: logical for including (or not) the geometry. Default is TRUE.
  • include_cnpv: logical for including (or not) CNPV demographic and socioeconomic information. Default is TRUE.

Available levels of aggregation come from the official spatial division provided by DANE, with their names corresponding to:

Code     Level                          Name
DPTO     Department                     DANE_MGN_2018_DPTO
MPIO     Municipality                   DANE_MGN_2018_MPIO
MPIOCL   Municipality including Class   DANE_MGN_2018_MPIOCL
MZN      Block                          DANE_MGN_2018_MZN
SECR     Rural Sector                   DANE_MGN_2018_SECR
SECU     Urban Sector                   DANE_MGN_2018_SECU
SETR     Rural Section                  DANE_MGN_2018_SETR
SETU     Urban Section                  DANE_MGN_2018_SETU
ZU       Urban Zone                     DANE_MGN_2018_ZU
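
The dataset names in the table follow a single pattern: the prefix DANE_MGN_2018_ followed by the level code. A minimal sketch of that mapping in base R is shown below. The helper mgn_dataset_name is hypothetical (it is not part of ColOpenData) and only illustrates how a level code could be validated and turned into the dataset name that the dictionary function takes later in this vignette.

```r
# Level codes listed in the table above.
mgn_levels <- c(
  "DPTO", "MPIO", "MPIOCL", "MZN", "SECR", "SECU", "SETR", "SETU", "ZU"
)

# Hypothetical helper (not part of ColOpenData): build the dataset name
# for a single level code, failing early on unknown codes.
mgn_dataset_name <- function(spatial_level) {
  spatial_level <- toupper(spatial_level)
  if (!spatial_level %in% mgn_levels) {
    stop("Unknown spatial level: ", spatial_level)
  }
  paste0("DANE_MGN_2018_", spatial_level)
}

mgn_dataset_name("mpio")
#> [1] "DANE_MGN_2018_MPIO"
```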

In this vignette you will learn:

  1. How to download geospatial data using ColOpenData
  2. How to use census data included in geospatial datasets
  3. How to visualize spatial data using leaflet and ggplot2

We will use geospatial data at the municipality level (MPIO) for the department of Tolima and calculate the percentage of houses with internet connection in each municipality. Later, we will build static and dynamic plots using the approaches mentioned above.

We will start by importing the needed libraries.
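
The library calls themselves are not reproduced in this copy of the vignette. Judging from the functions used in the rest of the text, the following set should be sufficient; treat it as an assumption rather than the authors' exact list.

```r
# Assumed set of packages, inferred from the functions used in this vignette.
library(ColOpenData) # download_geospatial(), dictionary(), name_to_code_dep()
library(dplyr)       # %>%, filter(), mutate()
library(ggplot2)     # ggplot(), geom_sf(), scales and themes
library(leaflet)     # leaflet(), addPolygons(), colorNumeric()
```

geom_sf additionally needs the sf package to be installed, although it does not have to be attached explicitly.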

Disclaimer: all data is loaded into the environment of the user’s R session, but is not downloaded to the user’s computer. Spatial datasets can be very large and might take a while to load.

Downloading geospatial data

First, we download the data using the download_geospatial function, including the geometries and the census-related information.

mpio <- download_geospatial(
  spatial_level = "MPIO",
  include_geom = TRUE,
  include_cnpv = TRUE
)
#> Original data is retrieved from the National Administrative Department
#> of Statistics (Departamento Administrativo Nacional de Estadística -
#> DANE).
#> Reformatted by package authors.
#> Stored by Universidad de Los Andes under the Epiverse TRACE iniative.
head(mpio)
#> Simple feature collection with 6 features and 90 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -76.1027 ymin: 0.9764735 xmax: -74.89527 ymax: 2.326755
#> Geodetic CRS:  WGS 84
#>   codigo_departamento codigo_municipio_sin_con              municipio
#> 1                  18                      001              Florencia
#> 2                  18                      029                Albania
#> 3                  18                      094 Belén de Los Andaquíes
#> 4                  18                      247            El Doncello
#> 5                  18                      256              El Paujíl
#> 6                  18                      410           La Montañita
#>   codigo_municipio version       area  latitud  longitud encuestas enc_etnico
#> 1            18001    2018 2547637532 1.749139 -75.55824     71877         32
#> 2            18029    2018  414122070 1.227865 -75.88233      2825         24
#> 3            18094    2018 1191618572 1.500923 -75.87565      4243         54
#> 4            18247    2018 1106076151 1.791386 -75.19394      8809          0
#> 5            18256    2018 1234734145 1.617746 -75.23404      5795          0
#> 6            18410    2018 1701061430 1.302860 -75.23573      5113         15
#>   enc_no_etnico enc_resguardo_indigena enc_comun_negras enc_area_protegida
#> 1         71845                     32                0                  0
#> 2          2801                     24                0                  0
#> 3          4189                     54                0                  1
#> 4          8809                      0                0                  0
#> 5          5795                      0                0                  0
#> 6          5098                     15                0                  0
#>   enc_area_no_protegida un_vivienda un_mixto un_no_res un_lea
#> 1                 71877       61176     2178      8436     87
#> 2                  2825        1826       49       948      2
#> 3                  4242        3223      109       900     11
#> 4                  8809        6598      357      1850      4
#> 5                  5795        4891      204       695      5
#> 6                  5113        4077      241       786      9
#>   un_mixto_no_res_industria un_mixto_no_res_comercio un_mixto_no_res_servicios
#> 1                        39                     1550                       566
#> 2                         3                       34                        12
#> 3                         4                       99                         6
#> 4                        11                      259                        87
#> 5                         4                      161                        38
#> 6                         2                      205                        32
#>   un_mixto_no_res_agro un_mixto_no_res_sin_info un_no_res_industria
#> 1                   18                        5                  54
#> 2                    0                        0                   3
#> 3                    0                        0                   8
#> 4                    0                        0                  21
#> 5                    1                        0                   5
#> 6                    2                        0                   3
#>   un_no_res_comercio un_no_res_servicios un_no_res_agro un_no_res_institucional
#> 1               2591                1061            535                     368
#> 2                 21                  32            728                      53
#> 3                 88                   6              2                      61
#> 4                334                 124            807                      89
#> 5                239                 104              4                      70
#> 6                103                 123             17                      84
#>   un_no_res_lote un_no_res_parque un_no_res_minero un_no_res_proteccion
#> 1           3172              233                7                   19
#> 2             92                5                0                    0
#> 3            626                8                0                    8
#> 4            361               42                0                   27
#> 5            211               16                0                    0
#> 6            362               30                0                    0
#>   u_no_res_construccion u_no_res_sin_info viviendas viv_casa viv_apartamento
#> 1                   371                25     63354    47817           13764
#> 2                    14                 0      1875     1793              40
#> 3                    93                 0      3332     3189             113
#> 4                    39                 6      6955     6006             775
#> 5                    45                 1      5095     4700             145
#> 6                    58                 6      4318     3890             224
#>   viv_cuarto viv_trad_indigena viv_trad_etnica viv_otro viv_ocupado_personas
#> 1       1624                21               8      120                49809
#> 2         22                17               0        3                 1409
#> 3         24                 2               2        2                 2883
#> 4        160                 1               1       12                 5767
#> 5        188                 4               1       57                 4568
#> 6        161                 6               2       35                 3553
#>   viv_ocupado_sin_personas viv_temporal viv_desocupado hogares viv_energia
#> 1                     2681         2150           8714   51430       48638
#> 2                       13           55            398    1559        1300
#> 3                        2          107            340    3161        2595
#> 4                      304          388            496    6129        5375
#> 5                        5          323            199    5848        4195
#> 6                      151          308            306    3748        2159
#>   viv_sin_energia viv_energia_estrato_1 viv_energia_estrato_2
#> 1            1171                 34851                 10343
#> 2             109                  1184                   106
#> 3             288                  2118                   366
#> 4             392                  3548                   962
#> 5             373                  3330                   770
#> 6            1394                  1964                   144
#>   viv_energia_estrato_3 viv_energia_estrato_4 viv_energia_estrato_5
#> 1                  2169                   509                    13
#> 2                     1                     0                     0
#> 3                    17                     2                     1
#> 4                   793                     1                     1
#> 5                    84                     0                     1
#> 6                     9                     1                     0
#>   viv_energia_estrato_6 viv_energia_sin_estrato viv_acueducto viv_sin_acueducto
#> 1                     3                     750         45179              4630
#> 2                     0                       9           808               601
#> 3                     0                      91          2017               866
#> 4                     1                      69          4175              1592
#> 5                     1                       9          2505              2063
#> 6                     0                      41          1441              2112
#>   viv_alcantarillado viv_sin_alcantarillado viv_gas viv_sin_gas
#> 1              41138                   8671   37028       12074
#> 2                703                    706      26        1371
#> 3               1806                   1077      52        2796
#> 4               4323                   1444      57        5549
#> 5               2359                   2209    1463        3041
#> 6               1329                   2224      67        3454
#>   viv_sin_info_gas viv_rec_basuras viv_sin_rec_basuras viv_internet
#> 1              707           45491                4318        13362
#> 2               12             727                 682           27
#> 3               35            1905                 978           73
#> 4              161            4348                1419          211
#> 5               64            2414                2154          125
#> 6               32            1273                2280           64
#>   viv_sin_internet viv_sin_info_internet personas per_leas
#> 1            35727                   720   156789     4315
#> 2             1370                    12     4514      151
#> 3             2775                    35     9075      346
#> 4             5395                   161    17775      203
#> 5             4379                    64    13014      192
#> 6             3457                    32    12128      604
#>   per_hogares_particulares hombres mujeres per_0_a_9 per_10_a_19 per_20_a_29
#> 1                   152474   77620   79169     25503       30249       29951
#> 2                     4363    2323    2191       725        1016         717
#> 3                     8729    4551    4524      1592        2254        1388
#> 4                    17572    8790    8985      3047        3811        2601
#> 5                    12822    6601    6413      2346        2882        2170
#> 6                    11524    6437    5691      2229        3022        1836
#>   per_30_a_39 per_40_a_49 per_50_a_59 per_60_a_69 per_70_a_79 per_80_mas
#> 1       23602       17235       14349        8969        4687       2244
#> 2         568         536         445         253         162         92
#> 3        1121         986         816         487         286        145
#> 4        2302        2032        1792        1135         707        348
#> 5        1587        1460        1188         703         430        248
#> 6        1563        1441        1010         578         323        126
#>   per_ed_primaria per_ed_secundaria per_ed_superior per_ed_posgrado
#> 1           37918             14123           14606             856
#> 2            1696               150              98               0
#> 3            2596               418             171              12
#> 4            6091               712             347              26
#> 5            4805               261             226               0
#> 6            5011               384             134               0
#>   per_ed_sin_educacion per_ed_sin_info shape_length shape_area
#> 1                 5892            3799     2.942508 0.20692777
#> 2                  215              46     1.112829 0.03361758
#> 3                  720             123     2.234657 0.09674460
#> 4                 1095             171     3.154370 0.08986744
#> 5                  916              99     3.529316 0.10030928
#> 6                  724             182     3.402939 0.13817351
#>                             geom
#> 1 MULTIPOLYGON (((-75.42074 2...
#> 2 MULTIPOLYGON (((-75.89506 1...
#> 3 MULTIPOLYGON (((-75.78705 1...
#> 4 MULTIPOLYGON (((-75.36167 2...
#> 5 MULTIPOLYGON (((-75.36638 2...
#> 6 MULTIPOLYGON (((-75.40346 1...

After downloading, we have to filter by department, using the DIVIPOLA code for Tolima, which can be obtained with name_to_code_dep. For further details on DIVIPOLA codification and functions, please refer to Documentation and Dictionaries.

name_to_code_dep("Tolima")
#> [1] "73"

To understand which column contains the department codes and filter for Tolima, we need the corresponding dataset dictionary, which can be downloaded with the dictionary function. This function takes the dataset name and returns the associated information. For further details, please refer to the documentation on dictionaries mentioned above.

dict <- dictionary("DANE_MGN_2018_MPIO")

head(dict)
#>                   variable         tipo longitud
#> 1      codigo_departamento         Text        2
#> 2 codigo_municipio_sin_con         Text        3
#> 3                municipio         Text      250
#> 4         codigo_municipio         Text        5
#> 5                  version Long Integer       NA
#> 6                     area       Double       NA
#>                                                                                     descripcion
#> 1                                                                       Código del departamento
#> 2                                                            Código que identifica al municipio
#> 3                                                                          Nombre del municipio
#> 4                                                Código concatenado que identifica al municipio
#> 5                                                              Año de la información geográfica
#> 6 Área del municipio en metros cuadrados  (Sistema de coordenadas planas MAGNA_Colombia_Bogota)
#>   categoria_original
#> 1               <NA>
#> 2               <NA>
#> 3               <NA>
#> 4               <NA>
#> 5               <NA>
#> 6               <NA>
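
head(dict) shows only the first six variables, while the dataset has 90 fields. One way to find the relevant entries is a keyword search over the variable names. The sketch below is base R and, to stay self-contained, runs on a handful of column names copied from the head(mpio) output above rather than on the full dictionary. With the real objects, the same grepl call could be applied to dict$variable.

```r
# A few column names copied from the head(mpio) output shown earlier.
vars <- c(
  "codigo_departamento", "codigo_municipio", "viviendas",
  "viv_internet", "viv_sin_internet", "viv_sin_info_internet", "personas"
)

# Keep only the names that mention "internet".
vars[grepl("internet", vars, fixed = TRUE)]
#> [1] "viv_internet"          "viv_sin_internet"      "viv_sin_info_internet"
```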

After exploring the dictionary, we can see that the column containing the department codes is codigo_departamento. We will filter based on that column.

tolima <- mpio %>% filter(codigo_departamento == "73")
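
The dictionary lists codigo_departamento as Text of length 2, which is why the comparison value is written as the string "73". The base R sketch below, run on a toy data frame rather than on the downloaded data, shows why keeping codes as strings matters: codes with a leading zero, such as "05" for Antioquia, do not match their numeric form.

```r
# Toy data (not the downloaded dataset): department codes stored as text.
toy <- data.frame(
  codigo_departamento = c("05", "18", "73"),
  municipio = c("Medellín", "Florencia", "Ibagué")
)

# Comparing against a string matches the stored codes exactly.
toy[toy$codigo_departamento == "73", ]

# Comparing against a number coerces the number to text first:
# 73 becomes "73", but 5 becomes "5", which is not equal to "05".
toy$codigo_departamento == 73 # FALSE FALSE  TRUE
toy$codigo_departamento == 5  # FALSE FALSE FALSE
```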

To calculate the percentage of houses with internet connection, we need the number of houses with internet connection and the total number of houses in each municipality. From the dictionary, the number of houses with internet connection is viv_internet and the total number of houses is viviendas. We will calculate the share, stored as a proportion between 0 and 1 and rounded to two decimals, as follows:

tolima <- tolima %>% mutate(internet = round(viv_internet / viviendas, 2))
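
As a quick check of the formula, the same calculation can be applied by hand to the six Caquetá municipalities printed by head(mpio) above. The sketch uses base R and only values copied from that output. It also compares the three internet columns with viv_ocupado_personas: in these six rows they add up exactly, which suggests that the internet counts refer to occupied dwellings, so viv_ocupado_personas could be an alternative denominator. That reading is an inference from the printed rows, not something stated in the dictionary excerpt.

```r
# Values copied from the head(mpio) output (first six municipalities).
caqueta <- data.frame(
  municipio = c(
    "Florencia", "Albania", "Belén de Los Andaquíes",
    "El Doncello", "El Paujíl", "La Montañita"
  ),
  viviendas = c(63354, 1875, 3332, 6955, 5095, 4318),
  viv_ocupado_personas = c(49809, 1409, 2883, 5767, 4568, 3553),
  viv_internet = c(13362, 27, 73, 211, 125, 64),
  viv_sin_internet = c(35727, 1370, 2775, 5395, 4379, 3457),
  viv_sin_info_internet = c(720, 12, 35, 161, 64, 32)
)

# Same formula as in the vignette: a proportion rounded to two decimals.
caqueta$internet <- round(caqueta$viv_internet / caqueta$viviendas, 2)
caqueta$internet
#> [1] 0.21 0.01 0.02 0.03 0.02 0.01

# The three internet columns add up to the occupied dwellings in every row.
all(
  caqueta$viv_internet + caqueta$viv_sin_internet +
    caqueta$viv_sin_info_internet == caqueta$viv_ocupado_personas
)
#> [1] TRUE
```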

Static plots (ggplot2)

ggplot2 can be used to generate static plots of spatial data with the geom_sf geometry, as follows:

ggplot(data = tolima) +
  geom_sf(mapping = aes(fill = internet), color = NA)

By default, the plot uses a blue palette, which makes small differences in internet coverage across municipalities hard to see. Color palettes and themes can be customized for each plot through aesthetics and scales, as described in the ggplot2 documentation. We will use a two-color gradient (scale_fill_gradient) to make the differences more visible.

ggplot(data = tolima) +
  geom_sf(mapping = aes(fill = internet), color = NA) +
  theme_minimal() +
  theme(
    panel.grid = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank()
  ) +
  scale_fill_gradient("Percentage", low = "#10bed2", high = "#deff00") +
  ggtitle(
    label = "Internet coverage",
    subtitle = "Tolima, Colombia"
  )

Dynamic plots (leaflet)

For dynamic plots, we can use leaflet, an open-source library for interactive maps. To recreate the same plot, we will first create the color palette.

colfunc <- colorRampPalette(c("#10bed2", "#deff00"))
pal <- colorNumeric(
  palette = colfunc(100),
  domain = tolima$internet
)
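
To see what colorNumeric receives, it can help to inspect the output of colfunc on its own. colorRampPalette comes from grDevices, which ships with base R, so the sketch below runs without leaflet. It only checks the length and the two end colors of the ramp.

```r
# colorRampPalette() returns a function that interpolates between colors.
colfunc <- colorRampPalette(c("#10bed2", "#deff00"))
ramp <- colfunc(100)

length(ramp)    # number of colors generated, ordered from low to high
ramp[c(1, 100)] # the first and last entries are the two input colors
```

colorNumeric then maps each value in its domain (here, tolima$internet) onto a position along this ramp.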

With this color palette we can generate the interactive plot. leaflet also provides open-source base maps, such as OpenStreetMap and CartoDB. For further details on leaflet, please refer to the package’s documentation.

leaflet(tolima) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(
    stroke = TRUE,
    weight = 0,
    color = NA,
    # Formulas are evaluated against the data passed to leaflet()
    fillColor = ~ pal(internet),
    fillOpacity = 1,
    popup = ~ paste0(internet)
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = ~internet,
    opacity = 1,
    title = "Internet Coverage"
  )