Summary:
When analyzing Brazilian statistical data at the municipal scale, we run into the problem that municipalities have been divided over the years. The evolution of the municipal grid (malha municipal) makes time-series analysis difficult because new municipalities have no data for the years before their creation. In addition, the old municipalities that were divided cannot be compared before and after their division because they lost territory and population. In the following text I present a methodology for dealing with this problem by aggregating municipalities into groups based on the evolution of the municipal grid over the years. The resulting groups can be downloaded for the years 1980, 1991 and 2000, referenced to the municipal grid of 2010. The aggregations make it possible to analyze all kinds of socio-economic and demographic data from the Brazilian Institute of Geography and Statistics (IBGE) and from other academic and state institutions.
In statistical analysis we may encounter the problem that our units of observation change over time. This is the case for Brazilian municipality data, where many new municipalities were created during the last decades. In 2010 Brazil had 5567 municipalities, of which 3991 were created between 1940 and 2010. These new municipalities lack data for the years before they existed. If a municipality was created in 1994, for example, there is no demographic data for it before 1994. Furthermore, since municipalities are not created out of nothing, one or more other municipalities may show a sudden decrease in population in the subsequent year (here 1995) because their areas and populations were divided. The same holds for a large set of other socio-economic variables produced in the studies of the Brazilian Institute of Geography and Statistics (IBGE) and others. If we analyze the development of municipalities over time, we have to account for this problem in order to avoid artificial increases or decreases in our variables that are largely a legacy of administrative change.
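To make the artificial decrease concrete, here is a minimal Python sketch with invented numbers. The municipality names "A" and "B" and all population figures are hypothetical and only illustrate the effect; they are not taken from IBGE data.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

# Hypothetical populations: "B" is split off from "A" in 1994.
pop_1993 = {"A": 50000}
pop_1995 = {"A": 35000, "B": 16000}

# Looking at "A" alone suggests a sharp population loss.
naive = percent_change(pop_1993["A"], pop_1995["A"])

# Summing "A" and "B" compares the same territory in both years.
aggregated = percent_change(sum(pop_1993.values()), sum(pop_1995.values()))

print(f"A alone: {naive:.1f}%")
print(f"A and B combined: {aggregated:.1f}%")
```

With these made-up figures, "A" alone appears to lose 30% of its population, while the combined territory actually grows by 2%.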
In Brazil, municipality creation surged especially in the 1980s in the course of decentralization policies and ongoing population growth in rural and urban areas. Although IBGE provides information on the evolution of the so-called “Malha Municipal”, I was not able to identify a method or dataset that specifically solved this problem.
Therefore, I developed a set of algorithms in R that group municipalities relative to a base year, e.g. 1991, in order to create an aggregation level that is neutral to municipality change. I applied these algorithms for the years 1980, 1991 and 2000, which are the years of the population censuses prior to 2010. Furthermore, I limited the analysis to the states of the Brazilian Amazon, since some manual edits have to be made to the raw data, for example where municipality names were misspelled by the staff who created the documentation on the evolution of the “Malha Municipal”. Doing it for the whole country with the R script should still be feasible with a few hours of extra work. If you are interested, I can provide the script and some explanations on how to use it.
For the theoretical part I have two images that show how the aggregation is done. The first image shows a typical situation of how municipalities might split up over the years. In this example, between 1991 and 2000 the municipality “d” developed out of “a” alone. Between 2000 and 2010, “b” developed out of “a” and “c”.
Here we already get a notion of the complexity of the task, since a new municipality often develops out of multiple others and group affiliation changes over time. If we group these municipalities together, we create aggregates of municipalities that did not form an administrative unit in the past. This is a small drawback, but there is no way around it, since “b” might contain population from both “a” and “c”.
Image 2 shows how the algorithm for creating the groups actually works.
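The grouping idea can also be sketched in code. The following Python example is not the R script used for the results here; it is a minimal illustration that treats every "new municipality / origin municipality" pair as a link and merges linked municipalities with a union-find structure. The function name `groups_for_base_year`, the tuple layout and the years 1995 and 2005 are assumptions made for this illustration.

```python
from collections import defaultdict

def groups_for_base_year(events, base_year):
    """Group municipalities linked by creation events after base_year.

    events: iterable of (year, new_municipality, origin_municipality).
    Returns a sorted list of groups, each a sorted list of names.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group (with path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for year, new, origin in events:
        if year > base_year:  # earlier changes are already part of the base year
            root_new, root_origin = find(new), find(origin)
            if root_new != root_origin:
                parent[root_new] = root_origin

    groups = defaultdict(list)
    for name in list(parent):
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())

# The example from the first image, with hypothetical creation years:
# "d" comes from "a" (1995); "b" comes from "a" and "c" (2005).
events = [(1995, "d", "a"), (2005, "b", "a"), (2005, "b", "c")]

print(groups_for_base_year(events, 1991))  # [['a', 'b', 'c', 'd']]
print(groups_for_base_year(events, 2000))  # [['a', 'b', 'c']]
```

For the base year 2000, “d” already exists and is not touched by later changes, so it stays outside any group and would keep its own geocode.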
For my analysis I used this Excel sheet from IBGE that contains information on how municipalities developed over time. IBGE also provides detailed, yearly information on the development of the “Malha Municipal” in each state. With this raw material it is theoretically possible to go back to any year of interest to get the minimum amount of aggregation necessary for your analysis. However, it takes some time to download and prepare the data so that it is comparable to the table linked above.
The results of my aggregation are shown below. You can download the Shapefile that contains the groups based on the years 1980, 1991 and 2000. The data is based on the Malha Municipal from 2010 with three columns called “m_1980”, “m_1990” and “m_2000”. Each column contains either the group affiliation or, if the municipality was not affiliated to any group, its geocode.
- “m_1980” contains municipalities that were part of a group in 1970 and were split between 1970 and 1980. Altogether 139 municipalities were created and 221 municipalities were part of the splitting process and hence have a group affiliation.
- “m_1991” contains municipalities that were part of a group in 1980 and were split between 1980 and 1991. Altogether 263 municipalities were created and 497 municipalities were part of the splitting process and hence have a group affiliation.
- “m_2000” contains municipalities that were part of a group in 1991 and were split between 1991 and 2000. Altogether 15 municipalities were created and 32 municipalities were part of the splitting process and hence have a group affiliation.
Furthermore, you can find seven columns with the geocodes of those municipalities that gave origin to another municipality in each respective year. My proposed methodology gives the minimum amount of aggregation necessary.
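Once the attribute table of the Shapefile is loaded, the group columns can be used to sum a variable per aggregation unit. The Python sketch below assumes the table has already been read into a list of dictionaries; the geocodes, the group label "group_1", the column name "population" and all values are invented for illustration, and the real format of the group identifiers may differ.

```python
from collections import defaultdict

def aggregate_by_group(rows, group_column, value_column):
    """Sum value_column over all rows that share the same value in group_column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_column]] += row[value_column]
    return dict(totals)

# Hypothetical attribute table. The third municipality was never part of a
# split, so its "m_2000" entry is simply its own geocode.
rows = [
    {"geocode": "9900001", "m_2000": "group_1", "population": 35000},
    {"geocode": "9900002", "m_2000": "group_1", "population": 16000},
    {"geocode": "9900003", "m_2000": "9900003", "population": 8000},
]

totals = aggregate_by_group(rows, "m_2000", "population")
print(totals)  # {'group_1': 51000, '9900003': 8000}
```

Plain sums like this only make sense for additive variables such as population counts. Rates or averages would have to be recomputed from their underlying totals.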
In the dataset I used, no new municipalities were created between 2000 and 2010. However, I cannot tell for sure whether this is due to an actual lack of municipality creation in the region or whether the dataset from IBGE is outdated. Other sources indicate that between 2000 and 2010 the number of municipalities increased from 5507 to 5565. It is unclear, however, how many municipalities were created in the Amazon states. Checking the official shapefiles of the “Malha Municipal” from IBGE, we observe that
- in 2000: only 792 municipalities existed in the nine Amazon states
- in 2007: 807 and
- in 2015: 808
It is therefore most likely that the dataset is incomplete.
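The size of the gap follows directly from the three shapefile counts listed above. A short Python check, using only those counts, gives the net change between the available years:

```python
# Number of municipalities in the nine Amazon states, from the IBGE shapefiles.
counts = {2000: 792, 2007: 807, 2015: 808}

years = sorted(counts)
net_change = {
    (start, end): counts[end] - counts[start]
    for start, end in zip(years, years[1:])
}

for (start, end), change in net_change.items():
    print(f"{start}-{end}: net change of {change} municipalities")
```

This yields a net increase of 15 municipalities between 2000 and 2007 and of 1 between 2007 and 2015. These are net figures, so they do not show how many individual creations or mergers lie behind them.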
Last but not least, let me point out that besides the separation of municipalities, municipality boundaries and hence their areas also changed over the years. This is not documented in a comprehensive way and might pose a problem depending on your type of analysis. The last image shows a comparison of the Acrean municipalities in 1991 (purple with black boundaries) and 2010 (gray with red boundaries).
One might assume that aggregation based on spatial intersections would also be an adequate tool for this problem. However, if intersections are combined with the grouping based on the evolution of the “Malha Municipal”, the groups might get too many members to be useful in your analysis.