Mapping population distribution in the urban environment: the Cadastral-based Expert Dasymetric System (CEDS).

Publication: Cartography and Geographic Information Science
Publication Date: 01-APR-07

Article Excerpt
What is Dasymetric Mapping?

Dasymetric mapping refers to a process of disaggregating spatial data to a finer unit of analysis, using additional (or "ancillary") data to help refine locations of population or other phenomena (Mennis 2003). This disaggregation process will result in areas of homogeneity that take into account (and more closely resemble) the actual phenomena being modeled, rather than areal units based on administrative or other arbitrary boundaries. Although it is generally used to get better results for actual locations of population, dasymetric mapping theoretically can be used to disaggregate any quantitative variable that is aggregated by geographic units, such as administrative divisions including census enumeration units, ZIP codes, counties, and police precincts; or environmental districts, including watersheds, wetlands, or flood plains.

The locations of census tract boundaries, ZIP code postal zones, or any other administrative boundaries, do not necessarily relate to the underlying phenomena, having been created arbitrarily or to suit other governmental purposes. Population totals within a given areal zone are assumed to be distributed evenly throughout the zone, when in fact population distribution is generally much more heterogeneous (Wu et al. 2005). This creates errors when trying to establish accurate rates for GIS analyses pertaining to health studies, crime patterns, hazard/risk assessment, land-use planning, or environmental impacts, among others, that rely on a smaller unit of analysis than the original zones. Examples of this are impact buffers that intersect the census enumeration unit, or a different set of zones altogether that do not coincide with the original set (e.g., overlaying data from units with non-coincident boundaries and/or overlapping spatial units such as census tracts and police precincts or health districts).

Although dasymetric mapping has been in use since at least the early 1800s, it has never achieved the ubiquity of other types of thematic mapping, and thus the means of producing dasymetric maps have never been standardized and codified the way other types of thematic mapping techniques have been (Eicher and Brewer 2001; Slocum 1999). Therefore, dasymetric methods remain highly subjective, with inconsistent criteria. The reason for this relative lack of popularity and the paucity of standard methodology surely lies at least partially in the difficulty inherent in constructing dasymetric maps, and until recently, the difficulties in obtaining the necessary data, as well as access to the computer power required to generate them.

The dasymetric method we have developed uses census data in conjunction with cadastral (tax lot) data in order to create a more precise picture of where people actually live. Using data aggregated by census enumeration units assumes that population is distributed homogeneously throughout the unit, which is rarely the case in reality. This assumption of homogeneity results in incorrect denominators (counts of the total population affected by the phenomena being investigated) being used to calculate rates (disease, crime, impacted populations, and so forth), which in turn results in either an under- or overestimation of the risk or its occurrence.
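To make the denominator problem concrete, the following is a minimal sketch using entirely hypothetical numbers (none are taken from the study). It assumes a census tract of 4,000 residents, an impact zone covering 25 percent of the tract's area, and a "true" share of 60 percent of residents living inside that zone. The function name and all variable names are illustrative only.

```python
def rate_per_1000(cases, population):
    """Return the number of cases per 1,000 residents."""
    return 1000.0 * cases / population


# Hypothetical inputs, chosen only for illustration.
tract_population = 4000          # census count for the whole tract
share_of_area_in_zone = 0.25     # fraction of tract area inside the impact zone
share_of_people_in_zone = 0.60   # assumed true fraction of residents in the zone
cases_in_zone = 30               # cases observed inside the impact zone

# Denominator under the homogeneity assumption: people follow area.
assumed_denominator = tract_population * share_of_area_in_zone

# Denominator if the actual residential distribution were known.
actual_denominator = tract_population * share_of_people_in_zone

assumed_rate = rate_per_1000(cases_in_zone, assumed_denominator)
actual_rate = rate_per_1000(cases_in_zone, actual_denominator)

print(f"rate assuming homogeneity: {assumed_rate:.1f} per 1,000")
print(f"rate with actual residents: {actual_rate:.1f} per 1,000")
```

With these made-up figures the homogeneity assumption yields 30.0 cases per 1,000, while the denominator based on where people actually live yields 12.5 per 1,000, so the risk is overestimated. Reversing the two shares would produce an underestimate instead.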

The proposed Cadastral-based Expert Dasymetric System (CEDS) leads to a better estimation of population (and potentially of specific sub-populations), and thus to a more complete understanding of the spatial distribution and patterns of disease, crime, hazard, exposure, and other issues. Following our review below of some of the most frequently used dasymetric methods, we will describe the CEDS method, and then present it through an example of mapping population distribution in New York City, by comparing choropleth mapping, areal interpolation, filtered areal weighting, and CEDS. We then further illustrate CEDS through a case study showing how the CEDS method improves an environmental health justice analysis of asthma in the Bronx.

Historical Background of Dasymetric Mapping

Many early cartographic endeavors were concerned predominantly with producing maps intended for navigational and exploration purposes; these required furthering our abilities to observe and measure the physical world with increasing levels of precision (Hall 1994). Technical advancements in instrument design and geometric theory made these more precise maps possible, and they generally portrayed tangible aspects of the physical world, such as areal sizes of geographic units, topography, temperatures, and sea depths (Dorling and Fairbairn 1997). Maps depicting social, cultural, or economic aspects of the world are termed thematic maps--those showing a particular "theme," such as poverty levels, disease rates, or the flow of migration. Thematic maps (also called statistical maps, if depicting a quantitative data theme) are generally of more recent vintage (Dent 1999). One of the earliest known examples of a thematic map is the mathematician Baron Pierre Charles Dupin's 1826 unclassed choropleth map showing illiteracy levels in each of the administrative départements comprising France, where the areal units were shaded in grey tones, with the darker tones indicating a higher illiteracy rate (Robinson 1982: p. 232).

Although no real typology of thematic maps had been developed at that time, most of the major types of statistical graphics and thematic maps as we know them today originated in the first half of the 19th century as a means to visualize quantitative information. As national governmental powers grew and consolidated in this time period, the need arose for a more detailed view of the population and associated data related to population, such as numbers about health, crime, education, poverty, and economics (Koch 2005). Statistical mapping met this need, and for the first time, the types of data needed to produce these maps were collected and made available.

Milestones in dasymetric mapping would have to include Scrope's 1833 classed population density map of the world, which used a rudimentary dasymetric technique (Scrope 1833). However, the Russian geographer, Semenov-Tyan-Shansky (1827-1914), who studied under von Humboldt and Ritter in Berlin and advanced the use of statistical mapping, has often been credited with inventing the dasymetric map (Bielecka 2005). The American geographer, John Kirtland Wright (1891-1969), who was perhaps the first person to publish a paper on dasymetric mapping in an English-language journal, stated that dasymetric means "density measuring." His 1936 paper is generally considered the seminal paper on dasymetric mapping, in which he extolled the virtues of the dasymetric map over the choropleth map (Wright 1936). He also coined the term "choropleth" (value-by-area) map, although choropleth maps had been in use since at least the early nineteenth century.

Today, the need for visualization of population data is even greater, not just for descriptive purposes--to show the geographic extent and density of populations--but also for spatial analytical and predictive modeling purposes, in order to inform risk assessments and public policy formation on many urban issues (Gregory 2000; Moon and Farmer 2001; Poulsen and Kennedy 2004; Sleeter 2004). The more traditional thematic mapping techniques may not be sufficient to display and analyze these data. Choropleth mapping, one of the most widely used thematic map techniques today, has many benefits, but it is lacking in a few important ways. The choropleth method is familiar, and easily comprehended and interpreted by the map reader, and it is comparatively straightforward to compute. For instance, population density for a given enumeration unit can be normalized by dividing the total population by the areal measurement of the unit. However, drawbacks include the Modifiable Areal Unit Problem (MAUP), which describes the phenomenon that, by modifying areal boundaries and/or the level of data aggregation, the results of the spatial analysis will be substantially different (Openshaw 1984).
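The density calculation and the MAUP can both be illustrated with a small sketch. The data below are hypothetical: four equal-area blocks with made-up populations, aggregated under two made-up zoning schemes. The function `zone_densities` and all names are illustrative assumptions, not part of any method described here.

```python
def zone_densities(block_pop, block_area, zoning):
    """Population density (people per unit area) for each zone.

    `zoning` is a list of tuples; each tuple holds the indices of the
    blocks that are aggregated into one zone.
    """
    densities = []
    for zone in zoning:
        population = sum(block_pop[i] for i in zone)
        area = sum(block_area[i] for i in zone)
        densities.append(population / area)
    return densities


# Hypothetical blocks: same underlying population in both schemes.
block_pop = [100, 900, 100, 900]
block_area = [1.0, 1.0, 1.0, 1.0]

zoning_a = [(0, 1), (2, 3)]  # each zone pairs a sparse block with a dense one
zoning_b = [(0, 2), (1, 3)]  # sparse blocks together, dense blocks together

densities_a = zone_densities(block_pop, block_area, zoning_a)
densities_b = zone_densities(block_pop, block_area, zoning_b)

print(densities_a)
print(densities_b)
```

Under the first zoning both zones show a density of 500.0 and the map looks uniform; under the second the zones show 100.0 and 900.0. The underlying population has not changed, only the boundaries used to aggregate it.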

Choropleth maps also have a propensity to generalize the high and low values within a given enumeration unit, removing the spatial heterogeneity in the data values. Additionally, choropleth maps depict abrupt changes at the boundaries of enumeration units, which are based on the existence of artificially defined boundaries, and not boundaries defined by the reality of the data. Dasymetric maps can be subject to abrupt boundary changes as well, but "these transitions are a better reflection of the true underlying geography of the area than the transitions in choroplethic maps, which are artifacts partially attributable to the arbitrary delineation of areal boundaries. This limitation of dasymetric mapping is offset by the technique's better visualization of population patterns, due to the high degree of spatial disaggregation that can be achieved" (Holt et al. 2004, p. 104).

Methods and Data Used in Dasymetric Mapping

Transferring data from one set of geographic zones or districts to another set of non-coincident zones is often necessary in spatial analysis. For instance, we might have data on the number of people living within a certain census tract but need to estimate the number of people in a smaller area within the tract, or an area that includes only part of that tract and part of other tracts. We may be interested in population or other data at a watershed level and only have population data available at the census enumeration units. "In any one study, several different types of data may be collected at differing scales and resolutions, at different spatial locations, and in different dimensions" (Gotway and Young 2002, p. 632).

A typical example of this is the problem encountered when conducting spatial analysis on historical census data from various time periods, with each temporally different attribute data set using different spatial data as well, because the tract boundaries used to aggregate the attribute data may change with each census period (Gregory 2000). How can one determine the number of people living in only a portion of an area for which data have been aggregated, or in an area for which the zones containing the data of interest do not coincide among various data layers?

Several methods of disaggregating population data are discussed below: weighted areal interpolation; filtered weighted areal interpolation; the use of land use/land cover as ancillary data for filtering; three-class and limiting variable methods; "image texture" method; statistical approaches, such as regression-based methods; heuristic sampling; kernel density surface using weighted census centroids; the use of other types of ancillary data sets, such as street-weighted interpolation; and the proposed CEDS method.

Areal Interpolation

A common method for calculating disaggregated population values is areal interpolation. This is defined as "the transfer of data from one set (source units) to a second set (target units) of overlapping, non-hierarchical, areal units" (Langford et al. 1991, p. 56). Areal interpolation is closely related to dasymetric mapping of population densities (Holt et al. 2004). The main difference between areal interpolation and dasymetric mapping is that with the latter approach, the data are not re-aggregated into a desired enumeration unit as they are with areal interpolation (Eicher and Brewer 2001).

A simple method of areal interpolation is to weight the variable's values by a ratio derived from the relative areal measurements of the two types of zones (source and target) (Goodchild and Lam 1980). Areal weighting is based on the assumption that population (or another variable) is distributed homogeneously throughout the "source" zone (the original unit of data aggregation). The population estimated to lie in the intersecting zone (or "target" zone) is assumed to be proportional to the share of the source zone's area that falls within the target zone. That area ratio is then applied to the population in the source zone to yield the population total in the target zone.
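As a rough illustration of areal weighting, the sketch below estimates a target zone's population as the sum, over source zones, of each source population multiplied by the fraction of that source zone's area lying inside the target zone. To stay self-contained it assumes zones are axis-aligned rectangles, which real census units are not; a GIS overlay would supply the intersection areas in practice. All names, geometries, and populations are hypothetical.

```python
from typing import Dict, Tuple

# A zone is simplified to an axis-aligned rectangle: (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle; zero if the rectangle is empty or inverted."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles (zero if disjoint)."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return area(inter)


def areal_weighting(sources: Dict[str, Tuple[Rect, float]], target: Rect) -> float:
    """Estimate the target zone's population by simple areal weighting.

    Assumes population is spread evenly within each source zone.
    """
    estimate = 0.0
    for rect, population in sources.values():
        source_area = area(rect)
        if source_area > 0:
            estimate += population * overlap_area(rect, target) / source_area
    return estimate


# Hypothetical source zones (geometry, population) and one target zone.
sources = {
    "A": ((0.0, 0.0, 2.0, 2.0), 400.0),
    "B": ((2.0, 0.0, 4.0, 2.0), 100.0),
}
target = (1.0, 0.0, 3.0, 2.0)  # straddles the boundary between A and B

estimated_population = areal_weighting(sources, target)
print(estimated_population)
```

Here the target covers half of zone A and half of zone B, so the estimate is 200 + 50 = 250.0 people. If zone A's residents were in fact concentrated in its western half, this figure would be too high, which is the weakness of the homogeneity assumption.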

In a study of areal interpolation for socioeconomic data, Goodchild et al. (1993) looked at a typical problem of spatial analysis using non-coincident areal units, namely the 58 counties of California (the source zones) and the state's 12 major hydrological basins (the target zones). The boundaries of the two sets of spatial units were, for the most part, incompatible. The socioeconomic data are available on the county level, but data connected with water issues are collected based on the hydrologic basin units that correspond to major watershed boundaries. In order to conduct a major economic impact study of water usage and policy, variables such as employment, income, and population had to be transferred from the county spatial units to the hydrological regions. Goodchild et al. (1993) used direct areal weighting to accomplish this, assuming that densities in the source zones (the counties) were uniform. When later comparing the results of the areal weighting method with other methods using statistical approaches, they found that areal weighting had a much higher...
