Brazilian municipality evolution and time-series analysis

Sumário em português:

Analisando dados estatísticos do Brasil em escala municipal encontramos o problema da divisão de municípios ao longo dos anos. A evolução da malha municipal dificulta a análise de dados em series temporais por que os novos municípios não tem dados para os anos prévios a sua criação,- Além disso os velhos municípios que foram divididos não podem ser comparados antes e depois da sua divisão por que perderam território e gente. No seguinte texto apresento uma metodologia (em inglês) para lidar com este problema agregando os municípios em grupos baseados na evolução da malha municipal ao longo dos anos. Os grupos baseados nessa agregação podem ser baixados para os anos 1980, 1991 e 2000 referente à malha municipal de 2010. As agregações permitem analisar todo tipo de dado socioeconómico e demográfico do Instituto Brasileiro de Geografia e Estatística – IBGE e de outras instituições académicas e estaduais.

Long version

In statistical analysis we may encounter the problem that our units of observation change over time. This is true for the case of Brazilian municipality data where the units of observation change due to the fact that a lot of new municipalities were created during the last decades. In 2010 Brazil possessed 5567 municipalities of which 3991 where created between 1940 and 2010. Those new municipalities lack data for the previous years to their existence. If a municipality was e.g. created in 1994, there is no demographic data previous to 1994. Furthermore, since municipalities are not created out of nothing, one or multiple other municipalities might suddenly experience a decrease in population in the subsequent year (here 1995) because their areas and populations were divided. This is also true for a large set of other socio-economic variables that are produced in the studies of the Brazilian Institute for Geography and Statistics (IBGE) and others. If we analyze the development of municipalities over time we have to account for this problem in order to avoid artificial increases or decreases in our variables, that might be largely a legacy of administrative change.

In Brazil, municipality creation spurred especially in the 1980s in course of the decentralization policies and ongoing population increase in the rural and urban areas. Although IBGE provides information on the evolution of the so called “Malha Municipal“ I was not able to identify a method or data-set that specifically solved that problem.

Therefore, I developed a set of algorithms in R that help to group municipalities on a base year e.g. 1991 in order to create an aggregation level that is neutral to municipality change. I applied these algorithms for the years 1980,1991 and 2000, which are the years of the population censuses prior to 2010. Furthermore, I limited the analysis to the states of the Brazilian Amazon, since some manual edits have to be made to the raw data, if municipalities where e.g. misspelled by the staff that created the documentation on the evolution of the „Malha Municipal“. Doing it for the whole country with the R script should be still feasible with a few hours of extra work. If you are interested, I can provide you the script and some explanations on how to use it.

For the theoretical part I have two images that show how the aggregation is done. The fist image shows a typical situation on how municipalities might split up over the years. In this example between 1991 and 2000 the municipality “d“ developed out of „a“ alone. Between 2000 and 2010 “b“ developed out of „a“ and „c“.

algorithm-municipality-time-series

Here we already get a notion of the complexity of the task since a new municipality often develops out of multiple others and group affiliation changes over  time. If we group these municipalities together we create aggregates from municipalities, that did not form an administrative unit in the past. This is a little drawback but there is no other way around it since “b“ might contain both, population data from „a“ and „c“.

Image 2 shows how the algorithm for creating the groups actually works.

algorithm-municipality-time-series2

For my analysis I used this Excel sheet from IBGE that contains information on how municipalities developed over time. IBGE also provides detailed, yearly information on the development of the “Malha municipal” in each state. With this raw-material it is theoretically possible to go back to any year of interest to get the minimum amount of aggregation necessary for your analysis. However, it takes some time to download and prepare the data in order to be comparable to the table linked above.

The results of my aggregation are shown below. You can download the Shapefile that contains the groups based on the year 1980,1991 and 2000. The data is based on the Malha Municipal from 2010 with three columns called “m_1980″,”m_1990” and “m_2000”. Those columns contain either the group affiliation or the geocode of the municipality if it was not affiliated to any group.

  • “m_1980” contains muncipalities that where part of a group in 1970 and where splitted during 1970 and 1980. Altogether 139 municipalities where created and 221 municipalities where part of the splitting process, hence they have a group affiliation.
  • “m_1991” contains muncipalities that where part of a group in 1980 and where splitted during 1980 and 1991. Altogether 263 municipalities where created and 497 municipalities where part of the splitting process, hence they have a group affiliation.
  • “m_2000” contains muncipalities that where part of a group in 1991 and where splitted during 1991 and 2000. Altogether 15 municipalities where created and 32 municipalities where part of the splitting process, hence they have a group affiliation.

Furthermore, you can find seven columns with the geocodes of those municipalities that gave origin to another municipality in each respective year.My proposed methodology gives the minimum amount of aggregation necessary.

In the data-set I utilized there were no new municipalities created between 2000 and 2010. I cannot tell for sure however, if this is due to a lack of actual creation of municipalities in the region or if the dataset from IBGE is outdated. Other sources point to the fact that at least between 2000 and 2010 the number of municipalities increased from 5507 to 5565. It is unclear however how many municipalities where created in the Amazon States. Checking with official shapefiles on the “malha municipal from IBGE” we observe that

  • in 2000: only 792 municipalities existed in the nine Amazon states
  • in 2007: 807  and
  • in 2015: 808

It is most probable therefore, that the data-set is somehow incomplete.

Last but not least let me point you to the fact, that besides the separation of municipalities also municipality boundaries and hence their areas changed during the years, which is not documented in a comprehensive way and might pose a problem depending on your type of analysis. You see a comparison in in the last image of the Acrean municipalities in 1991 (purple with black boundaries) and 2010 (gray with red boundaries).

changingmunbound

One might assume that aggregation based on intersects might be also an adequate tool for this problem, however if combining intersections and the grouping based on the evolution of the “Malha Municipal”, the groups might get too many members to be useful in your analysis.

Making a movie with R

Since some models (like mine) do not always give a direct interface with results, R provides us with a powerful tool to play around with big piles of data models spit out. Obviously, you also may want to do this despite existing model interfaces!  The program R is making life often happy due to its endless possibilities. Since I am performing a vegetation model with individual trees modelled, one of my wishes was to see the forest in real life and 3D. So the forest is growing, waning, stabilizing and so forth. Making a movie with R!

All right, I directly admit: the movie is not made with R, but is calling another program from outside R; namely Image Magick in combination with the FFMPEG software. Nevertheless the code is totally in R and it works really nice.
If you have time-dependent graphs, plots, maps or whatsoever, you can use this neat function to make a moving pictures file. Of course you can also use a sleeper function within the R studio display, but it is easier for use to make a seperate movie.

First I will show you my personal code, which is making a movie of a growing tree stand and then explain it command by command:

# MAKING THE GRAPHS THAT FORM THE BASE FOR THE MOVIE!
TOT=floor(tlen/interval)    # decide on number of graphs needed, based on total time and time steps
for(step in 0:TOT){      # start the loop of making graphs here!
  t=1+(interval*step)             # step counter  
  # decide on some specific graph-parameters before and put them in 1 dataframe
  TreeStruct=data.frame(gridyx1,gridyx2,HTrees,CTrees,STrees,FTrees) 
  
  # USING LOGICAL NUMBERING FOR THE GRAPHS HERE!
  # open the file in which the graph is drawn: the following if-statements are to make sure the graphs are numbered well, 
  # so 100 is not appearing before 1 and so on...
  if(step<10){             # make sure generated graphs are numbers from 1-999
    jpeg(filename=sprintf("%sMovie/%s/Flash_00%s.jpeg",Folder,nameshort,step), width=1000, height=800, units="px")
  } else if(step<100){
    jpeg(filename=sprintf("%sMovie/%s/Flash_0%s.jpeg",Folder,nameshort,step), width=1000, height=800, units="px")
  } 
  
  # THE REAL GRAPH MAKING!
  with(TreeStruct, {
    s3d <- scatterplot3d(gridyx1, gridyx2,HTrees,             # x y (grids) and z (height) axis in scatterplot3d package
                         color=CTrees, pch=FTrees,            # color (species) and form (canopyform) of symbols
                         cex.symbols=STrees,                  # symbol size (canopy size)
                         zlim=c(0:40),# alpha=0.2,                        # maximum value for z-axis (height)
                         type="h", lty.hplot=2,               # lines to the horizontal plane and its line type (2=dashes)
                         grid=TRUE, main=sprintf("A Hectare of Tree Growth \n (time = %s)",t), # include the grid and graph title
                         xlab="", ylab="", zlab="Height (m)") # axes titles
                     })
  # Sys.sleep(0.5)      # possibility to let the movie run in the R-studio plot (might be disfunctional when using the whole script)
  graphics.off()   }    # the loop for making graphs ends here!
# MAKING THE REAL MOVIE HERE! USE IMAGE MAGICK AND FFMPEG SOFTWARE FOR LINUX
# create morphing images with same size (morph 3 = 3 images per graph):
cmd_morph <- paste0("convert ", sprintf("%sMovie/%s/*.jpeg",Folder,nameshort),
                    " -morph 3 ", sprintf("%sMovie/%s/",Folder,nameshort), "%05d.morph.jpg")
# create the movie from the morph images (-r 10 = movie speed 10 fps; -qscale 2 = quality class 2):
cmd_mov <- paste0("ffmpeg -r 10 -i ",sprintf("%sMovie/%s/",Folder,nameshort), 
                  "%05d.morph.jpg -qscale 2 ", sprintf("%sMovie/%s/%s_%.0f%.0f.mp4",Folder,nameshort,nameshort,x,y))
# run the command lines created above in linux terminal
system(cmd_morph) 
system(cmd_mov)

As one can see, I first make the graphs, that form the basis of the movie. I did not provide all details here, since it would make the script too long, due to the specifics of trees. Of course you can skip this all, when you have your own nice graphs. Just be aware you put them in a logical numbering order to read for the movie maker program later!
I also wanted to show here the s3d plotting function of R, which is really cool. It makes it possible to have 3D graphs. There are even options to look at the whole grid from different angles. (do not forget to download and call the package at the start of your R-script)

So, the real movie making starts at the very end of this script with the programs Image Magick and FFMPEG. In R you can create a command using the paste0-function, which is then called on the system with “system(command line)”.
We use two commands: the first I call “cmd_morph”, which creates the morphed pictures, so the movie is flowing; and the second one I call “cmd_mov”, which in fact makes the movie.

> cmd_morph <- convert inputfile/*.jpeg -morph x outputfile/%05d.morph.jpg
x = number of frames (morphed pictures) in between input graphs – the higher this number, the more flowinf the movie will be, but also the more time it needs to generate – standard is 2 to 5.
inputfile/ = file address of input graphs in jpeg or other picture type – take care this file only entails the graphs for the movie (or use inputfile/rootname*.jpeg).
outputfile/ = file address, where morphed output files will be stores.
%05d.morph.jpg = name of output file using 5 numbers (for max. 99999 frames) – change this 5 into 3 (max 999 frames) or any other number, depending on the frame number x.

> cmd_mov <- ffmpeg -r a -i inputfile/%05d.morph.jpg qscale b outputfile/name.mp4
a = the r-factor, meaning the number of frames per second – standard is 10.
b = the qscale-factor, meaning the quality class of the movie – standard is 2.
inputfile/05d.morph.jpg = file addres of the morphed input files, that resulted from the previous command cmd_morph.
Outputfile/name.mp4 = file address for the movie – consider a specific name for the location and time.

Hope this bit of code has entertained you a little bit.

Import multiple files to R

I have been recently asked a few times how you can import a bunch of data (let’s say for example .csv files) to your R-Environment without copying and pasting a lot of code. I’m not aware of a built-in-package in R that does that for you (although I can imagine that somewhere there might be one) but I will show a little example on how you can do this manually. The advantage is that you can easily modify the code to import other file-types and if you are a beginner with R you might get some feeling for automating, loops and lists in R.

The main idea behind the following code is, that you put all your files into one directory and read them into R with a loop. Therefore you will have to work with lists which serve as a “container” to receive the incoming data. Note that your .csv files need to have the same characteristics in order to automate the process. If you have for example  csv. files with different separators (one with commas and another with semicolons) the import will not work as expected. If you have never imported data into R before, try the read functions on single files before you go to automating.

The code is quite self-explanatory:


## import_multiple_csv_files_to_R
# Purpose: Import multiple csv files to the Global Environment in R

# set working directory
setwd("~/R/projects/tutorials/import_multiple_data_to_R/")

# list all csv files from the current directory
list.files(pattern=".csv$") # use the pattern argument to define a common pattern  for import files with regex. Here: .csv

# create a list from these files
list.filenames<-list.files(pattern=".csv$")
list.filenames

# create an empty list that will serve as a container to receive the incoming files
list.data<-list()

# create a loop to read in your data
for (i in 1:length(list.filenames))
{
list.data[[i]]<-read.csv(list.filenames[i])
}

# add the names of your data to the list
names(list.data)<-list.filenames

# now you can index one of your tables like this
list.data$deforestation2010.csv

# or this
list.data[1]

# you can make a function out of this
import.multiple.csv.files<-function(mypath,mypattern,...)
{
tmp.list.1<-list.files(mypath, pattern=mypattern)
tmp.list.2<-list(length=length(tmp.list.1))
for (i in 1:length(tmp.list.1)){tmp.list.2[[i]]<-read.csv(tmp.list.1[i],...)}
names(tmp.list.2)<-tmp.list.1
tmp.list.2
}

# use it like this
csv.import<-import.multiple.csv.files("~/R/projects/tutorials/import_multiple_data_to_R/",".csv$",sep=",")
# note: with ... we enable the function to refine the import with parameters from read.csv.
# here we define the separator of entries in the csv files to be comma.

# save it to the folder with your custom functions
save(import.multiple.csv.files,file="~/R/functions/import.multiple.csv.files.RData")

# load it like this whenever you need it in another script with
load("~/R/functions/import.multiple.csv.files.RData")

# end

Automated Mapping in R

After creating our base-layers within a PostGIS database as described here we used R to map our layers and create a plot showing the area count for each year from 1920 to 2012. In total we created 1840 plots (92 years x 20 buffers) that where then combined to produce this video.

The source code for the mappings is complex and wont be discussed in detail here but a brief description of the single steps as well as the code is provided for anybody interested in R-Programming:

A. Session Setup
Here we set our working directory, load all required libraries and suppress scientific notations (later-on important for the plots)

B. Load the data
We load all relevant data.

  • ucs_cum is a table that contains the area counts for all years and all protected area classes. We will need it for the area count on the left side of the video.
  • brasil and world.countries are two shapefiles that we need for our plots. World.countries contains the administration boarders for  all countries worldwide and brasil the boarders for Brazil. We fortify these shapefiles to make them understandable and plottable for ggplot2.
  • map.picture3 is the picture on the upper left of the video. We rasterize it to make it plottable.
  • Next we load all shapefiles that we need for the maps. All in all we have 3 different protected areas with each 20 shapefiles. To avoid too much code we have to automatize this follwing a simple three step approach: First we create a list with all filenames in a loop. Second we create an empty list where we will store our imports. Third we import the shapefiles with a loop thereby using the list of file-names. shapefiles.list.import contains all layers for all multiple use areas. shapefile.list.upi contains all layers for strictly protected areas and shapefile.list.ind contains all layers for indigenous territories. Afterwards we rearrange the order of the list and plot the minimum years because we will need this information later.

C: Construct parameters for the lineplot

Here we calculate the extension of all protected areas (total_protected_area_sqkm) and the extension of indigenous areas + strictly protected areas (total_uus_ind_sqkm). We need this information later  for the areacount plot.

D: Define viewports for the plots

We define different Viewports for the plots. Inside these Viewports we will later paste the different parts of our video (the map, the area count and the picture)

E: Define color schemes for the plots

We define custom colors for the different protected areas.

F: Define the different themes for the plots

We define how our plots made with ggplot2 should be lay-outed. Actually this piece of code was only created after we created the plots which come later. We externalize this part of the code because the plot function that follows afterwards is very long. If you do testplots and want to adjust small parameters in the layout, it might be an advantage to have the layout outside of the plot function.

G: Define additional parameters for the plots

Here we define the plot-title.

H: Construct the plotfunction

Now it gets tricky. In this function we do the following things.

  1. First we create additional layers that contain the information about protected areas (tmp.data.unbuffered.1-3). These layers will display all protected areas that have been created until the respective year. We have to do it with an if-else statement since we don’t know if there is any data for the given year.
  2. Second we create additional layers that contain information only about the respective year. In this layer we plot our buffers that indicate the newly created areas (tmp.data1-3). Again we have to do it with an if-else statement since we don’t know if there is any data for the given year.
  3. We construct the mapplot by adding the baselayers (countries worldwide, Brazil) and additional layers from 1 and 2. We furthermore add limits, title, projection and theme configurations to the plot.
  4. We construct the lineplot (areacount) with geom_ribbon from ggplot2.
  5. We define the area count as a title of the lineplot and modify the plot.
  6. We load the picture
  7. We plot our plots. Therefore we first open a new png file. Inside this file we create a Grid where we push the Viewports. In each Viewport we place our plots.
  8. We turn the device off and finish our plotting function.

I: Plot the data

We plot the data iteratively over all years and all buffersizes. We therefore define the output path and create two vectors for the iteration. One containing the years to be plotted and another one containing the alpha levels for the different buffer sizes (if you watch the video carefully you might notice that the alpha level changes with buffersize).

We then plot everything which takes about 4 hours to complete.

J: Close the session

We close the session and save all created objects from the Global environment.

 

# protected_areas_plotting_1
# Author: Johannes Schielein
# Outline:
  # A. Session Setup
  # B. Load the data
  # C: Construct parameters for the lineplot
  # D: Define viewports for the plots
  # E: Define color schemes for the plots
  # F: Define the different themes for the plots
  # G: Define additional parameters for the plots
  # H: Construct the plotfunctions
  # I: Plot the data
  # J: Close the session

# ----------- A. Session Setup -----------
# set wd
setwd("~/nas/documents/programming/R/protected_areas_plotting/")

# load workspace (if already saved once)
load("r-workspaces/protected_areas_plotting_1.Rdata")

# activate required libraries
library("rgdal")
library("RODBC")
library("mapproj")
library("ggplot2")
library("gridExtra")
library("scales")
library("knitr")
library("extrafont")
#font_import() # only if not already imported
library("png")

# surpress cientific notation
options(scipen=999)

# ----------- B.Load the data -----------
### Load the base data
# load cumulative uc data
ucs_cum<-read.csv("statistics/statistics3112.csv")

# shp brasil
brasil<-readOGR("World Countries 2/","subset_brazil")
brasil.fortified<-fortify(brasil)

# shp worldcountries
world.countries<-readOGR("World Countries 2/","sa")
world.countries.fortified<-fortify(world.countries)

### load the PNG foto for the plots
map.picture3<-readPNG("pics/fotos3.png") #load fotos
map.picture3.rastered <-rasterGrob(map.picture3)

### Load all uus shapefiles
## create an interation vector with the names endings
# the endings of all shapfile names
shapefile.iteration.vector<-vector(length=20) #create the vector
shapefile.iteration.vector[1]<-0 # set the first place zero
for (i in 2:20)shapefile.iteration.vector[i]<-shapefile.iteration.vector[i-1]+5 #set the rest places according to the names endings
shapefile.iteration.vector #show the vector
## creat a vector with the complete names
shapefile.list<-vector(length=20) # create the vector
# create a list with the shapefile names
for (i in shapefile.iteration.vector)shapefile.list[i/5]<-paste("_uus_buffer",i,sep="") # paste beginning and end of names
shapefile.list[20]<-("_uus_buffer0") # _uus_buffer0 was lost so set it automatically
shapefile.list
## import all shapefiles to the list
shapefiles.list.import<-list(length=20)
for (i in 1:20)shapefiles.list.import[[i]]<-readOGR("buffer_shps/",shapefile.list[i])

### Load all upi shapefiles
## create an interation vector with the names endings
## creat a vector with the complete names
shapefile.list.upi<-vector(length=20) # create the vector

# create a list with the shapefile names
for (i in shapefile.iteration.vector)shapefile.list.upi[i/5]<-paste("_upi_buffer",i,sep="") # paste beginning and end of names
shapefile.list.upi[20]<-("_upi_buffer0") # _uus_buffer0 was lost so set it automatically
shapefile.list.upi
## import all shapefiles to the list
shapefiles.list.import.upi<-list(length=20)
for (j in 1:20)shapefiles.list.import.upi[[j]]<-readOGR("buffer_shps/",shapefile.list.upi[j])

### Load all indigenous shapefiles
## create an interation vector with the names endings
## creat a vector with the complete names
shapefile.list.ind<-vector(length=20) # create the vector

# create a list with the shapefile names
for (i in shapefile.iteration.vector)shapefile.list.ind[i/5]<-paste("_ind_buffer",i,sep="") # paste beginning and end of names
shapefile.list.ind[20]<-("_ind_buffer0") # _uus_buffer0 was lost so set it automatically
shapefile.list.ind

## import all shapefiles to the list
shapefiles.list.import.ind<-list(length=20)
for (k in 1:20)shapefiles.list.import.ind[[k]]<-readOGR("buffer_shps/",shapefile.list.ind[k])

### rearrange the shapefiles lists to have a better naming
shapefiles.list.import.2<-list()
shapefiles.list.import.ind.2<-list()
shapefiles.list.import.upi.2<-list()
shapefiles.list.import.2<-rev(shapefiles.list.import[1:19])
shapefiles.list.import.2[[20]]<-shapefiles.list.import[[20]]
shapefiles.list.import.ind.2<-rev(shapefiles.list.import.ind[1:19])
shapefiles.list.import.ind.2[[20]]<-shapefiles.list.import.ind[[20]]
shapefiles.list.import.upi.2<-rev(shapefiles.list.import.upi[1:19])
shapefiles.list.import.upi.2[[20]]<-shapefiles.list.import.upi[[20]]

# remove old shapefilelists
rm(shapefiles.list.import,shapefiles.list.import.ind,shapefiles.list.import.upi)

# find out the first year of the different type of protected areas
min(shapefiles.list.import.2[[20]]$DATE_CREAT,na.rm=T) # uus = 1927
min(shapefiles.list.import.ind.2[[20]]$DATE_CREAT,na.rm=T) # ind = 1926
min(shapefiles.list.import.upi.2[[20]]$DATE_CREAT,na.rm=T) # upi = 1896

### ----------C: construct parameters for the lineplot----------

# calculate sqkm for ucs cumulatice
for (i in 2:5)
  ucs_cum[paste((names(ucs_cum)[i]),"_sqkm",sep="")]<-ucs_cum[,i]/1000000

tail(ucs_cum)

# calculate the total protected area
ucs_cum$total_protected_area_sqkm<-ucs_cum$uus_m2_sqkm+ucs_cum$upi_m2_sqkm+ucs_cum$ind_m2_sqkm

# calculate uus+ind
ucs_cum$total_uus_ind_sqkm<-ucs_cum$uus_m2_sqkm+ucs_cum$ind_m2_sqkm

### --------- D: define viewports for the plots ----------
# define first plot area
vp1 <- viewport(x = 1.05, y = 0.133,
                height = 0.72, width = 0.72,
                just = c("right", "bottom"),
                name = "map"
)
# define second plot area
vp2 <- viewport(x = 0.05, y = 0.505,
                height = 0.3, width = 0.361,
                just = c("left", "bottom"),
                name="picture"
)
# define third plot area
vp3 <- viewport(x = 0.05, y = 0.12,
                height = 0.36, width = 0.361,
                just = c("left", "bottom"),
                name="lineplot"
)

# define fourth plot area
vp4 <- viewport(x = 0.5, y = 0.95,
                height = 0.1, width = 0.8,
                just = c("center", "center"),
                name="titel"
)

### ---------- E: define color schemes for the plots ----------
color.uus<-"#99FF33"
color.upi<-"#006600"
color.ind<-"#CC6600"

### --------- F: define the different themes for the plots ----------
## for additional information on how to manage them themes see the ggplot2 help
# http://docs.ggplot2.org/current/theme.html

# plot 1 = maps
theme.plot.1<-list(theme(panel.background = element_rect(fill = "lightskyblue2", color="black"),
                          panel.grid.minor = element_blank(),
                          panel.grid.major = element_blank(),
                          axis.text.x = element_text(size=25,colour="white"),
                          axis.text.y = element_text(size=25,colour="white"),
                          axis.title.x= element_text(size=25,colour="white"),
                          axis.title.y= element_text(size=25,colour="white"),
                          plot.title = element_text(size=75,vjust=3,family="DejaVu Sans Condensed",colour="white"),
                          plot.background = element_rect(fill="black",colour="black")
                         )
                   )
# plot 2 = lineplot
theme.plot.2<-list(theme(panel.grid.minor = element_blank(),
                         panel.grid.major = element_blank(),
                         plot.title = element_text(size=60,vjust=1,family="DejaVu Sans Condensed",colour="white"),
                         axis.text.x=element_text(size=25,angle=45,vjust=0.7,hjust=0.9,colour="white"),
                         axis.text.y=element_blank(),
                         axis.title.x=element_blank(),
                         axis.title.y=element_blank(),
                         panel.background=element_rect(fill = "black"),
                         legend.position=c(0,1),
                         legend.justification = c(0, 1),
                         legend.title=element_blank(),
                         legend.text=element_text(size=25,family="DejaVu Sans Condensed",colour="white"),
                         legend.key.height=unit(3,"line"),
                         legend.key.width=unit(3,"line"),
                         legend.background = element_rect(fill="black",colour="black"),
                         axis.ticks.length = unit(0.3, "cm"),
                         axis.line=element_line(),
                         axis.line.y = element_blank(),
                         plot.background = element_rect(fill="black",colour="black")
                         )
                   )

### --------- G: define additional parameters for the plots ----------
# the title
plot.title<-textGrob("Protected Areas in Brazil",
                     gp = gpar(fontfamily="DejaVu Sans Condensed",
                               fontsize = 90,
                               col="white"))

#### ---------- H: Construct the plotfunction ----------

plot.function.buffer<-function(my.path,
                               my.year,
                               data.unbuffered.1,
                               data.unbuffered.2,
                               data.unbuffered.3,
                               my.data.1,
                               my.data.2,
                               my.data.3,
                               my.buffer.size,
                               my.alpha,
                               my.picture)
{
  ### 1. fortify all shapefiles for the plot

  ## 1.1 fortify all existing protected areas until the year (unbufferd)
  # layer 1 unbuffered
  if(min(data.unbuffered.1@data$DATE_CREAT,na.rm=T)<=my.year+10000){
    tmp.data.unbuffered.1<-fortify(data.unbuffered.1[which
                                                     (data.unbuffered.1@data$DATE_CREAT<=my.year+10000),])}
  else{tmp.data.unbuffered.1<-"no area for first unbuffered layer"}
  # layer 2 unbuffered
  if(min(data.unbuffered.2@data$DATE_CREAT,na.rm=T)<=my.year+10000){
    tmp.data.unbuffered.2<-fortify(data.unbuffered.2[which
                                                     (data.unbuffered.2@data$DATE_CREAT<=my.year+10000),])}
  else{tmp.data.unbuffered.2<-"no area for second unbuffered layer"}
  # layer 3 unbuffered
  if(min(data.unbuffered.3@data$DATE_CREAT,na.rm=T)<=my.year+10000){
    tmp.data.unbuffered.3<-fortify(data.unbuffered.3[which
                                                     (data.unbuffered.3@data$DATE_CREAT<=my.year+10000),])}
  else{tmp.data.unbuffered.3<-"no area for third unbuffered layer"}

  ## 1.2 fortify protected areas for the respective year if there was a creation/data is available (with buffer)
  # layer 1 buffered
  if(nrow
     (my.data.1[[my.buffer.size]]@data[which
                                       (my.data.1[[my.buffer.size]]@data$DATE_CREAT>=my.year &
                                          my.data.1[[my.buffer.size]]@data$DATE_CREAT<=my.year+10000),])>0
  )
  {tmp.data1<-fortify(my.data.1[[my.buffer.size]][which
                                                  (my.data.1[[my.buffer.size]]@data$DATE_CREAT>=my.year &
                                                     my.data.1[[my.buffer.size]]@data$DATE_CREAT<=my.year+10000),])}else
                                                     {tmp.data1<-"no area for first layer"}
  # layer 2 buffered
  if(nrow
     (my.data.2[[my.buffer.size]]@data[which
                                       (my.data.2[[my.buffer.size]]@data$DATE_CREAT>=my.year &
                                          my.data.2[[my.buffer.size]]@data$DATE_CREAT<=my.year+10000),])>0
  )
  {tmp.data2<-fortify(my.data.2[[my.buffer.size]][which
                                                  (my.data.2[[my.buffer.size]]@data$DATE_CREAT>=my.year &
                                                     my.data.2[[my.buffer.size]]@data$DATE_CREAT<=my.year+10000),])}else
                                                     {tmp.data2<-"no area for second layer"}

  # layer 3 buffered
  if(nrow
     (my.data.3[[my.buffer.size]]@data[which
                                       (my.data.3[[my.buffer.size]]@data$DATE_CREAT>=my.year &
                                          my.data.3[[my.buffer.size]]@data$DATE_CREAT<=my.year+10000),])>0
  )
  {tmp.data3<-fortify(my.data.3[[my.buffer.size]][which
                                                  (my.data.3[[my.buffer.size]]@data$DATE_CREAT>=my.year &
                                                     my.data.3[[my.buffer.size]]@data$DATE_CREAT<=my.year+10000),])}else
                                                     {tmp.data3<-"no area for third layer"}

  ### 2. construct the mapplot
  ## 2.1 add the base layers
  tmp.plot<-ggplot(,aes(long,lat))+
    geom_map(aes_string(map_id="id"),
             map=world.countries.fortified,
             data=world.countries.fortified,
             fill="grey80",
             colour="black",
             size=0.3)+
    geom_map(aes_string(map_id="id"),
             map=brasil.fortified,
             data=brasil.fortified,
             fill="lemonchiffon2",
             colour="black",
             size=0.3)

  ## 2.2 add all existing protected areas until the year unbufferd
  if (is.data.frame
      (tmp.data.unbuffered.1)==TRUE)
  {tmp.plot<-
     tmp.plot+geom_map(aes_string(map_id='id'),
                       map=tmp.data.unbuffered.1,
                       data=tmp.data.unbuffered.1,
                       fill=color.ind,
                       colour="black",
                       size=0.3)}
  # else{print(tmp.data.unbuffered.1)} # activate this option if you want to trace the layers within each map
  if (is.data.frame
      (tmp.data.unbuffered.2)==TRUE)
  {tmp.plot<-tmp.plot+geom_map(aes_string(map_id='id'),
                               map=tmp.data.unbuffered.2,
                               data=tmp.data.unbuffered.2,
                               fill=color.uus,
                               colour="black",
                               size=0.3)}
  # else{print(tmp.data.unbuffered.2)}
  if (is.data.frame
      (tmp.data.unbuffered.3)==TRUE)
  {tmp.plot<-tmp.plot+geom_map(aes_string(map_id='id'),
                               map=tmp.data.unbuffered.3,
                               data=tmp.data.unbuffered.3,
                               fill=color.upi,
                               colour="black",
                               size=0.3)}

  ## 2.3 add protected areas with buffer if data is available for the respective year
  # layer ind
  if (is.data.frame(tmp.data1)==TRUE){tmp.plot<-
                                        tmp.plot+geom_map(aes_string(map_id='id'),
                                                          map=tmp.data1, data=tmp.data1,
                                                          fill=color.ind,
                                                          colour="black",
                                                          size=0.3,
                                                          alpha=my.alpha)}
  # layer uus
  if (is.data.frame(tmp.data2)==TRUE){tmp.plot<-
                                        tmp.plot+geom_map(aes_string(map_id='id'),
                                                          map=tmp.data2, data=tmp.data2,
                                                          fill=color.uus,
                                                          colour="black",
                                                          size=0.3,
                                                          alpha=my.alpha)}
  # layer upi
  if (is.data.frame(tmp.data3)==TRUE){tmp.plot<-
                                        tmp.plot+geom_map(aes_string(map_id='id'),
                                                          map=tmp.data3, data=tmp.data3,
                                                          fill=color.upi,
                                                          colour="black",
                                                          size=0.3,
                                                          alpha=my.alpha)}

  ## 2.4 add limits, title, projection and theme configurations to the plot
  tmp.plot<-tmp.plot+
    xlim(-80,-30)+
    ylim(-33,5)+
    ggtitle(my.year/10000)+
    coord_map(projection="mercator")+
    theme.plot.1

  ### 3. construct the lineplot
  ## 3.1 plot
  tmp.plot2<-
    ggplot(data=ucs_cum[which(ucs_cum$year<my.year/10000),])+
    geom_ribbon(aes(x=year,# is going to be area for upi
                    ymin=total_uus_ind_sqkm,
                    ymax=total_protected_area_sqkm,
                    fill=color.upi,
                    color=color.upi))+
    geom_ribbon(aes(x=year,# is going to be area for uus
                    ymin=ind_m2_sqkm,
                    ymax=total_uus_ind_sqkm,
                    fill=color.uus,
                    color=color.uus))+
    geom_ribbon(aes(x=year,#is going to be area for ind
                    ymin=0,
                    ymax=ind_m2_sqkm,
                    fill=color.ind,
                    color=color.ind))+

    ## 3.2. area count as title
    ggtitle(
      paste(
        format(
          round(
            ucs_cum$total_protected_area_sqkm[which(iteration.string.plot==my.year)],
            digits=0), # together with round cuts the digits off
          big.mark=","), # together with format creates the thousend seperators)
        "km²"
      ))+
    ## 3.3 modify the plot
    # scales and colours
    scale_fill_manual(values=c(color.upi,color.uus,color.ind),
                      labels=c("  Strictly Protected Areas","  Multiple Use Reserves","  Indigenous Areas"))+
    scale_colour_manual(values=c(color.upi,color.uus,color.ind),
                        labels=element_blank(),
                        guide="none")+ # to surpress the legend of the outlines
    scale_x_continuous(breaks=seq(1920, 2010, by = 10),
                       limits=c(1920,2013), # set value limits of axis
                       expand=c(0, 0))+ # set visible limits of axis
    scale_y_continuous(breaks=NULL,
                       limits=c(0,2800000))+
    # add the themes
    theme.plot.2

  ### 4. Construct the plot with extra information
  tmp.plot3<-map.picture3.rastered

  ###  Combine the plots and make png outputs
  ## new png
  png(filename=paste(my.path,"map_",my.year,"_buffer",my.buffer.size,".png",sep=""), width=2560, height=1600)
  ## mapplot

  # mapplot
  grid.newpage()# clear plot area
  grid.rect(gp = gpar(fill = "black")) #show the plotting region (viewport extent)
  pushViewport(vp1) # enter vp1
  print(tmp.plot, newpage = FALSE) # plot1
  upViewport(1) # leave vp1

  # lineplot
  pushViewport(vp3)
  print(tmp.plot2, newpage = FALSE)
  upViewport(1)

  # mapinfo
  pushViewport(vp2)
  print(grid.draw(my.picture), newpage=FALSE)
  upViewport(1)

  # title
  pushViewport(vp4)
  print(grid.draw(plot.title), newpage=FALSE)
  upViewport(1)

  ## Turn the device off to finish the print
  dev.off()

  ### Finishing the function
  print(paste("Map completed for",my.year/10000,"with buffersize",my.buffer.size,"and opacity set to",my.alpha))
}

#### --------- I: Plot the data ----------

# define the path for the plot outputs
path.plot<-"r-outputs/"

# create a string to iterate over the input data ucs year by year
iteration.string.plot<-vector(length=length(1896:2013))
iteration.string.plot[1]<-18960000
for (i in 2:length(1896:2013))iteration.string.plot[i]<-iteration.string.plot[i-1]+10000
iteration.string.plot

# create a vector that contains the transparency/alpha values for each buffer
alpha.vector.one<-seq(from=0.5,to=1,length.out=20)

# plot all years and buffers (over 2gb size!)
for (j in 2:length(iteration.string.plot)){
  for (i in 19)
  {plot.function.buffer(path.plot,
                        iteration.string.plot[j],
                        shapefiles.list.import.ind.2[[20]],
                        shapefiles.list.import.2[[20]],
                        shapefiles.list.import.upi.2[[20]],
                        shapefiles.list.import.ind.2,
                        shapefiles.list.import.2,
                        shapefiles.list.import.upi.2,
                        i,
                        alpha.vector.one[i],
                        map.picture3.rastered)
  }
  Sys.sleep(3)
}

# ---------- J: Close session----------
save.image("r-workspaces/protected_areas_plotting_1.Rdata")

PostGIS easy buffers

To create the video Evolution of protected areas in Brazil we created buffers around all protected areas with 20 different buffer sizes. For each of the 3 protected areas types (indigenous terretories, multiple use reserves, stricly protected areas) we used a different layer containing the polygons for each protected area. A display of the 20 buffers in each year within one second creates the impression of a continuous shrinking. The PostGIS code to create a single buffer for one layers is as follows:

create table uso_sustentavel_95 as
select
 gid,
 creation_date,
 st_buffer(geom,0.95) as geom
from uso_sustentavel_95

We saved the multiple use reserve shapefile on a PostgresSQL server with a PostGIS plugin, named uso_sustentavel. Every territory has an identifier called gid. The creation_date in our dataset was constructed from each decree that inaugurated the reserve. With each decree the publication day is provided. The geom column refers to the information about the geographic properties of each polygon. The st_buffer() creates automatically areas with a buffer of 0.95 degrees around all polygons. To create the 20 buffer zones automatically I used a loop created in STATA12.