GIS Files

For an overview of the geographic units and years covered by NHGIS GIS files, see Data Availability.

File format and geometry

NHGIS provides its geometry files for geographic information systems (GIS) as shapefiles, a standard spatial data file format. The shapefile format was originally defined for use in Esri GIS applications, but the format has become an industry standard, and many GIS and mapping tools are able to read and write shapefile data.

Most NHGIS GIS files have polygon geometries representing the boundaries of census reporting areas.

NHGIS also supplies two types of point files: centers of population and place points, both described under Derivation.

Geometry years

NHGIS generally identifies each GIS file by the survey year in which the file's represented areas were used for tabulations, which may differ from the vintage of the represented areas. For example:

  • The 2012 boundary file for Core Based Statistical Areas (CBSAs) follows the official 2009 CBSA delineations, which are the delineations used in 2012 American Community Survey (ACS) tables.
  • The 2012-2019 boundary files for Public Use Microdata Areas (PUMAs) all identify 2010 PUMAs, which are the PUMAs used in 2012-2019 ACS tables.

This Census Bureau page identifies the vintages of geographic areas for each ACS survey year since 2009.
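
The distinction between a file's year and the vintage of its areas can be treated as a simple lookup. The sketch below encodes only the two examples given above; the function name and the lowercase level codes ("cbsa", "puma") are hypothetical and are not NHGIS identifiers.

```python
# Lookup of (geographic level, file year) -> vintage of the represented areas.
# Only the two examples stated in the text are encoded; the level codes are
# hypothetical labels chosen for this sketch.
VINTAGE_BY_FILE_YEAR = {("cbsa", 2012): 2009}
VINTAGE_BY_FILE_YEAR.update({("puma", year): 2010 for year in range(2012, 2020)})


def area_vintage(level, file_year):
    """Return the vintage for a file year, or None if this sketch does not cover it."""
    return VINTAGE_BY_FILE_YEAR.get((level.lower(), file_year))
```

A result of None means only that the combination is not covered by this sketch, not that no such file exists.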

Note on 2009 census tracts and block groups:
To find the NHGIS boundary files for block groups and census tracts derived from 2009 TIGER/Line files, users should filter on the year 2000 in the Data Finder. NHGIS identifies these boundaries with 2000, not 2009, because they correspond to the boundaries of 2000 census units and are not completely consistent with the units identified in 2009 ACS tables. Most of the block group and census tract tables from the 2009 5-Year ACS Summary File correspond to the Census 2000 definitions, but according to ACS documentation, "in 19 counties from 8 different states, many of the census tracts and block groups used to tabulate and present the 2005-2009 ACS 5-year estimates are either those submitted to the Census Bureau for the 2010 Census, or a preliminary version of 2010 Census definitions." More information on these discrepancies, including a listing of affected counties, is available here.

Unfortunately, no TIGER/Line files represent the actual set of tracts and block groups identified in 2005-2009 ACS tables, so NHGIS does not provide boundary files for the "2009 vintage" of these units.

Derivation

Standard procedures for boundary files

NHGIS boundary files are derived primarily from the U.S. Census Bureau's TIGER/Line files with numerous additions to represent historical (1790-1980) boundaries that do not appear in TIGER/Line files. For more recent boundary files (1990 or later), NHGIS typically makes only a few key changes to the TIGER/Line source:

  1. We merge files that the Census Bureau provides only for individual states or counties to produce new nationwide or statewide files
  2. We project the data into Esri's USA Contiguous Albers Equal Area Conic Projected Coordinate System
  3. We add a “GISJOIN” attribute field, which supplies standard identifiers that correspond to the “GISJOIN” identifiers in NHGIS data tables
  4. We rename files to use the NHGIS naming style and geographic-level codes
  5. We add NHGIS-specific metadata in the XML file that accompanies the shapefile
  6. Most substantially, we erase coastal water areas to produce polygons that terminate at the U.S. coasts and Great Lakes shores
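
Steps 1 and 3 above can be illustrated without any GIS software by working on attribute records alone. The following is a minimal sketch, not the NHGIS workflow itself: it concatenates per-state records into one list and then joins them to data-table rows on the shared "GISJOIN" field. The function names, the "NAME" and "POP" fields, and all identifier values are hypothetical placeholders.

```python
def merge_state_records(per_state_records):
    """Concatenate per-state attribute records into one nationwide list."""
    merged = []
    for records in per_state_records:
        merged.extend(records)
    return merged


def join_on_gisjoin(boundary_records, table_rows):
    """Attach table values to boundary records that share a GISJOIN value."""
    table_by_id = {row["GISJOIN"]: row for row in table_rows}
    joined = []
    for record in boundary_records:
        match = table_by_id.get(record["GISJOIN"])
        if match is not None:
            # Keep the boundary attributes and add the table columns.
            joined.append({**record, **match})
    return joined


# Hypothetical records: identifiers and values are placeholders, not NHGIS data.
state_a = [{"GISJOIN": "A001", "NAME": "Area 1"}]
state_b = [{"GISJOIN": "B001", "NAME": "Area 2"}]
table = [{"GISJOIN": "B001", "POP": 250}, {"GISJOIN": "A001", "POP": 100}]

nationwide = merge_state_records([state_a, state_b])
joined = join_on_gisjoin(nationwide, table)
```

Boundary records with no matching table row are dropped here, which corresponds to an inner join; a GIS package would typically offer other join types as well.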

1980 and earlier boundaries based on 2000 TIGER/Line files

Because the 2000 TIGER/Line files contain no identifiers for census areas from 1980 and earlier, NHGIS researchers obtained boundary definitions for those years by consulting other sources, including 1992 TIGER/Line data for 1980 census tracts; maps from printed census reports for 1910-1980 census tracts and other small areas; and the Map Guide to the U.S. Federal Censuses, 1790-1920, by William Thorndale and William Dollarhide (Genealogical Publishing Co., Baltimore, MD, 1987), for counties and states back to 1790. Where the historical boundaries follow 2000 TIGER/Line features, the original NHGIS boundary files re-use those TIGER/Line features. Elsewhere, NHGIS researchers digitized new boundaries. NHGIS boundary files based on these files are identified as "2000 TIGER/Line +" in the Basis column in the Select Data grid of the Data Finder.

1980 place and county subdivision boundaries

1980 boundaries for places and county subdivisions are derived from the U.S. Census Bureau's 1992 TIGER/Line files. NHGIS modified the TIGER/Line definitions only by erasing coastal water areas using the 2000 TIGER/Line coastal water definitions. NHGIS boundary files based on these files have a "1992 TIGER/Line +" Basis in the Data Finder.

1980 block boundaries

The 1980 block boundaries are derived primarily from the U.S. Census Bureau's 1992 TIGER/Line files, but 1980 block definitions in these files are incomplete and often inaccurate. NHGIS has applied extensive manual editing, using 1980 paper block maps as a guide, to reconcile differences between the 1992 TIGER/Line files and the 1980 census block summary data.

At this time, the 1980 block boundary files are complete only for selected metropolitan areas, and the files are not yet available through the Data Finder. They are instead available through the 1980 Block Boundaries page along with complete documentation and block-level summary statistics.

2000 boundaries based on 2010 TIGER/Line files

NHGIS also provides 2000 boundaries derived from the 2010 TIGER/Line files, which have a "2010 TIGER/Line +" Basis in the Data Finder. For these, NHGIS used 2010 coastal water areas to clip the 2000 boundaries to U.S. coasts and Great Lakes shores. The 2000 boundaries derived from 2010 TIGER/Line will better align with 2010 and newer GIS boundary files.

2009 and later boundaries

NHGIS has generally derived 2009 and later boundaries from the corresponding TIGER/Line release, as indicated by the Basis in the Data Finder. In each case, we typically apply only our standard modifications for boundary files. For the 2009 boundary files, NHGIS used 2010 TIGER/Line coastlines to erase coastal water areas.

2010 boundaries based on 2020 TIGER/Line files

NHGIS also provides 2010 boundaries derived from the 2020 Census Redistricting Data (P.L. 94-171) TIGER/Line files, which have a "2020 TIGER/Line +" Basis in the Data Finder. For these, NHGIS used 2020 coastal water areas to clip the 2010 boundaries to U.S. coasts and Great Lakes shores. The 2010 boundaries derived from 2020 TIGER/Line will better align with 2020 and newer GIS boundary files.

School attendance areas

The NHGIS shapefiles for school attendance areas were constructed through the School Attendance Boundary Information System (SABINS) project, as detailed in the SABINS GIS Data Documentation.

Centers of population

NHGIS provides point files representing the 2000, 2010, and 2020 centers of population for states, counties, census tracts, and block groups. NHGIS derived these points from the U.S. Census Bureau's Centers of Population.

Each point represents the mean center of population within the corresponding area, computed as an average of census block locations, weighted by block population, using a simple spherical model of the Earth's surface. As described in the Census Bureau's Centers of Population Computation documentation: "The center of population is the point at which an imaginary, weightless, rigid, and flat (no elevation effects) surface representation of [a geographic area] would balance if weights of identical size were placed on it so that each weight represented the location [of] one person."
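
A population-weighted mean center of this kind can be sketched in a few lines. The version below assumes a commonly cited form of the calculation, in which latitude is a population-weighted mean and longitude is additionally weighted by the cosine of latitude to account for meridians converging toward the poles. That formula is an assumption here and should be checked against the Census Bureau's computation documentation; the function name and input layout are hypothetical.

```python
import math


def mean_center(blocks):
    """Return (lat, lon) in degrees for blocks given as (lat_deg, lon_deg, population).

    Assumed formula: latitude is the population-weighted mean; longitude is
    weighted by population times cos(latitude).
    """
    blocks = list(blocks)
    total_pop = sum(pop for _, _, pop in blocks)
    if total_pop <= 0:
        raise ValueError("total population must be positive")

    lat = sum(pop * lat_deg for lat_deg, _, pop in blocks) / total_pop

    cos_weight = sum(pop * math.cos(math.radians(lat_deg)) for lat_deg, _, pop in blocks)
    lon = sum(
        pop * lon_deg * math.cos(math.radians(lat_deg)) for lat_deg, lon_deg, pop in blocks
    ) / cos_weight
    return lat, lon
```

When all blocks share one latitude, the cosine terms cancel and the longitude reduces to an ordinary population-weighted average.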

NHGIS undertook the following steps to generate its center-of-population shapefiles:

  1. Convert the Census Bureau's latitude/longitude coordinates to GIS features.
  2. Modify the feature attribute fields for consistency with other NHGIS files.
  3. Apply an equal-area conic projection for consistency with other NHGIS files.
  4. Attach metadata describing the file contents.
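
Steps 1 and 3 amount to turning a latitude/longitude pair into projected x/y coordinates. The sketch below uses the spherical Albers equal-area conic formulas as a simplification; a GIS package would use an ellipsoidal datum instead. The parameter values (standard parallels 29.5° and 45.5°, central meridian -96°, latitude of origin 37.5°) are assumed to match Esri's USA Contiguous Albers Equal Area Conic definition, and the Earth radius is an assumed round value, so results will differ somewhat from coordinates in the NHGIS shapefiles.

```python
import math

# Assumed parameters (not stated in this document); spherical simplification.
EARTH_RADIUS_M = 6_371_000.0
PHI1, PHI2 = math.radians(29.5), math.radians(45.5)   # standard parallels
PHI0, LAM0 = math.radians(37.5), math.radians(-96.0)  # origin latitude, central meridian


def albers_xy(lat_deg, lon_deg):
    """Project a latitude/longitude pair to Albers equal-area conic x, y in meters."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    n = (math.sin(PHI1) + math.sin(PHI2)) / 2.0
    c = math.cos(PHI1) ** 2 + 2.0 * n * math.sin(PHI1)
    rho = EARTH_RADIUS_M * math.sqrt(c - 2.0 * n * math.sin(phi)) / n
    rho0 = EARTH_RADIUS_M * math.sqrt(c - 2.0 * n * math.sin(PHI0)) / n
    theta = n * (lam - LAM0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)
```

With these parameters the projection origin (37.5° N, 96° W) maps to x = 0, y = 0, points east of the central meridian have positive x, and points north of the origin latitude on the central meridian have positive y.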

Place points

See the place points documentation page for complete information on the derivation of NHGIS place point files.

Realigned boundaries based on 2008 TIGER/Line files

Because the Census Bureau made major accuracy improvements to TIGER/Line features between the 2000 and 2008 TIGER/Line releases, the original NHGIS shapefiles based on 2000 TIGER/Line features are not comparable with newer TIGER/Line data. We therefore generated new 2008-based boundary files by systematically realigning the boundaries for tracts and counties to fit with 2008 TIGER/Line features, a process referred to as conflation. These conflated NHGIS boundary files are identified as "2008 TIGER/Line +" in the Basis column found in the Select Data grid.

The Census Bureau made additional improvements to TIGER/Line features after 2008, so the 2008 TIGER/Line-based files are not consistently comparable with 2010 and later TIGER/Line files. In general, most 2008-based boundaries align better than 2000-based boundaries with 2010 and later TIGER/Line files, but the 2008-based boundaries also include occasional gross inaccuracies.

For users who have no need to compare historical boundaries with boundaries from 2010 or later, we recommend using the original 2000-based NHGIS boundary files.

For users who do wish to compare or overlay historical boundaries with boundaries from 2010 or later, we recommend downloading and examining both the 2000- and 2008-based versions of historical boundaries in order to determine which is more suitable for your study area and analysis.

For users who wish to overlay 2000 boundaries with boundaries from 2010 or later, we recommend using the 2000 boundaries derived from 2010 TIGER/Line.
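
The three recommendations above can be summarized as a small decision helper. This is only a restatement of the guidance in code form: the function name and arguments are hypothetical, it covers boundary years up to 2000 only, and it does not check whether a given Basis is actually available for a particular year and geographic level.

```python
def recommended_bases(boundary_year, compare_with_2010_or_later):
    """Return the Basis values suggested by the guidance above.

    boundary_year: year of the historical boundary file (2000 or earlier).
    compare_with_2010_or_later: True if the boundaries will be compared or
    overlaid with boundaries from 2010 or later.
    """
    if boundary_year > 2000:
        raise ValueError("this sketch covers only boundaries from 2000 or earlier")
    if not compare_with_2010_or_later:
        return ["2000 TIGER/Line +"]
    if boundary_year == 2000:
        return ["2010 TIGER/Line +"]
    # Pre-2000 boundaries: examine both versions for the study area.
    return ["2000 TIGER/Line +", "2008 TIGER/Line +"]
```

Where two values are returned, the guidance is to download and inspect both versions rather than to prefer one automatically.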

Census Bureau TIGER/Line Technical Documentation
