Why don’t the data include FIPS codes?
What do the field names (e.g., AG3001, B5P001) in an NHGIS data file mean?
Do I need special statistical software like SPSS, SAS, or Stata to use NHGIS data?
Something is wrong with the data I downloaded. What should I do?
Can I just get a copy of every data table you have and save myself the hassle of repeatedly downloading different tables?
Does NHGIS have data for outside the USA?
Does NHGIS have data for Puerto Rico, the US Virgin Islands, or other territories?
Can I use NHGIS for genealogy?
I downloaded some GIS boundary files, but where is the census data?
OK, I downloaded the data tables and the GIS boundary files. Now how do I join them together?
Why am I missing columns of data when I bring the .csv into a GIS?
What projected coordinate system are the GIS boundary data in?
What is the unit of the SHAPE_AREA and SHAPE_LEN?
How long does a data extract take?
Why can’t I download data at the census tract level from before 1910?
Why do I receive so many data files in my downloaded zipped file?
I have a suggestion for how you can improve your website! Do you want to hear it?
Should I cite NHGIS in my paper? How?
Data Questions [top]
How do I obtain data? [top]
All NHGIS data are delivered through our data extraction system. Users select the data tables and GIS boundary files they would like, and the system creates a custom extract containing this information. Data are generated on our server, and the system sends an email message to the user when the extract is completed. The user must then download the extract and analyze it on their local machine. Users need to register for a free account before they can submit an extract request. Users do not, however, need to register or log in before building an extract request. Detailed information on using the data extraction system is provided on the User’s Guide page.
Why don’t the data include FIPS codes? [top]
FIPS codes did not exist prior to the 1970 Census, so they could not be provided for older census data tables. The NHGIS instead provides custom codes, which generally correspond to FIPS codes, with an added zero, for all areas that persisted beyond 1970. For recent censuses, NHGIS shapefiles include concatenated NHGIS codes but not the FIPS codes in order to minimize file size. FIPS codes do exist within all NHGIS data tables for 1970 and later, but the codes are not concatenated as you would typically find in data downloaded from American FactFinder.
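For 1970 and later tables, a concatenated FIPS code can be rebuilt from the separate state and county code columns. The following is a minimal Python sketch, not an official NHGIS procedure. It assumes the codes sit in columns named STATEA and COUNTYA (hypothetical names here; check the codebook in your extract) and that they should be zero-padded to two and three digits.

```python
def county_fips(state_code: str, county_code: str) -> str:
    """Concatenate a state code and a county code into a 5-digit county FIPS code."""
    # Zero-pad each part in case a spreadsheet program stripped leading zeros.
    return state_code.strip().zfill(2) + county_code.strip().zfill(3)


# Hypothetical row from a county-level data table; the column names are assumptions.
row = {"STATEA": "27", "COUNTYA": "053"}
fips = county_fips(row["STATEA"], row["COUNTYA"])
print(fips)  # 27053 (Hennepin County, Minnesota)
```

Keeping the codes as text rather than numbers avoids losing leading zeros, which matters for states such as Alabama (01) or California (06).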
What do the field names (e.g., AG3001, B5P001) in an NHGIS data file mean? [top]
The key to these unique, NHGIS-created column names is found in the Codebook file(s) that were automatically included in your data extract. Look for the .txt file(s) in the zipped file you downloaded, and they will shed some light on your data.
Do I need special statistical software like SPSS, SAS, or Stata to use NHGIS data? [top]
Absolutely not! Any spreadsheet software like Microsoft Excel will work fine, and even that technically is not required. Statistical software may make it easier to analyze large amounts of data, and NHGIS does provide an output file format specifically for the three major statistical software packages, complete with a command file for each.
Something is wrong with the data I downloaded. What should I do? [top]
There are lots of reasons why something may seem wrong with your data. Typically (but not always), the issue stems from using the incorrect data format for your software or from not being aware of the many, many odd quirks that exist in older census data. Review the User’s Guide and Data Documentation pages for additional information. Of course, always feel free to contact NHGIS User Support at firstname.lastname@example.org with any questions you may have!
Can I just get a copy of every data table you have and save myself the hassle of repeatedly downloading different tables? [top]
You might not realize how big the NHGIS really is. (It's into the terabytes!) Users frequently ask for this, thinking it will save them time to have every file we have; trust us, it won’t. Sifting through over eighteen thousand tables and hundreds of thousands of fields is not an easy task. Data-hungry researchers are free to email NHGIS to request large quantities of data delivered outside of the website data extraction system. Honoring said requests, however, is at the discretion of NHGIS staff.
Does NHGIS have data for outside the USA? [top]
NHGIS does not have international data. Two other projects at the Minnesota Population Center, IPUMS-International and NAPP, provide census microdata (not aggregate data) for other countries.
Does NHGIS have data for Puerto Rico, the US Virgin Islands, or other territories? [top]
We do not have any data for these areas for the years 1790-2000. The 2010 Census and American Community Survey (ACS) do include Puerto Rico data.
Can I use NHGIS for genealogy? [top]
Sorry! NHGIS data are aggregate data, which means we have no information on specific people. There are no names or addresses on anything we have. Genealogists, unfortunately, will have to look elsewhere for this type of data. Resources such as www.ancestry.com or www.familysearch.org may be useful websites to visit.
What are time series tables? [top]
NHGIS time series tables link together comparable statistics from multiple censuses in one table with standardized labels and codes for all years. Users can view available time series tables and select and download them in various layouts through the Data Finder. Complete information on the derivation, layout and coverage of time series tables is available here.
GIS Questions [top]
I don’t have Esri ArcGIS! Can I still map NHGIS data? [top]
There are several options for those who do not have access to Esri ArcGIS. A number of open source GIS programs are available, including GRASS and QGIS. In addition, the Social Explorer website allows online mapping of select NHGIS data. Another option for many students is a free student ArcGIS license, which can often be obtained through college GIS or Geography departments for class purposes, or by purchasing select books from the Esri Press. Student licenses vary in length, but are typically 6 or 12 months.
I downloaded some GIS boundary files, but where is the census data? [top]
GIS boundary files, on their own, do not contain any census data, even if you downloaded the data tables at the same time. Attaching tabular NHGIS data files to the GIS boundary files requires a join operation in your GIS software. Additional information on using a GIS with NHGIS data can be found on the User’s Guide page.
OK, I downloaded the data tables and the GIS boundary files. Now how do I join them together? [top]
NHGIS has made it as easy as possible to join data tables to their respective GIS boundary files. In both files, you will find a field called GISJOIN that will serve as the join field. Additional information on using a GIS with NHGIS data can be found on the User’s Guide page.
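For readers working outside a GIS, the same join can be sketched in plain Python. This is only an illustration of joining on GISJOIN: the small tables are made-up stand-ins for a data table and a boundary file's attribute table, and the GISJOIN values and AG3001 figures are invented for the example.

```python
import csv
import io

# Stand-in for an NHGIS data table (.csv); all values are made up.
data_csv = io.StringIO(
    "GISJOIN,AG3001\n"
    "G0100010,100\n"
    "G0100030,250\n"
)

# Stand-in for the attribute table of a GIS boundary file; values are made up.
boundary_attributes = [
    {"GISJOIN": "G0100010", "SHAPE_AREA": "1.5e9"},
    {"GISJOIN": "G0100030", "SHAPE_AREA": "4.1e9"},
    {"GISJOIN": "G0100050", "SHAPE_AREA": "2.3e9"},  # no matching data row
]

# Index the data table by GISJOIN, then attach each data row to its boundary record.
data_by_key = {row["GISJOIN"]: row for row in csv.DictReader(data_csv)}
joined = []
for feature in boundary_attributes:
    match = data_by_key.get(feature["GISJOIN"], {})
    joined.append({**feature, **match})

for record in joined:
    print(record)
```

Every boundary record is kept, and records without a matching data row simply lack the data fields, which mirrors how an attribute join in GIS software typically behaves.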
Why am I missing columns of data when I bring the .csv into a GIS? [top]
Unfortunately, when adding a .csv file into older versions of Esri ArcGIS the maximum number of columns that ArcGIS will import is 255, and any additional fields are truncated. This is a known issue to Esri and Microsoft and is outside the control of NHGIS. The latest version of ArcGIS, 10.1, however, has resolved this issue. Users of older versions of ArcGIS may try using the Quick Import tool that is part of the Data Interoperability extension to ArcGIS as a workaround. It is not available, however, for all ArcGIS users. Other solutions do exist and additional information on the issue, along with instructions on using the Quick Import tool, can be found on the User’s Guide page. In addition, be advised that older versions of Microsoft Excel (pre-2007) have the same 255-column limitation.
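One possible workaround, offered here as an assumption rather than as the method described in the User’s Guide, is to split a wide table into several narrower ones before importing it, repeating the GISJOIN column in each piece so that every piece can still be joined. The Python sketch below only plans the column groups; the field names are hypothetical.

```python
def column_chunks(header, key="GISJOIN", max_columns=255):
    """Split column names into groups of at most max_columns,
    repeating the key column at the start of every group."""
    others = [name for name in header if name != key]
    step = max_columns - 1  # leave room for the key column in each group
    return [[key] + others[i:i + step] for i in range(0, len(others), step)]


# Hypothetical 600-column table: GISJOIN plus 599 data fields.
header = ["GISJOIN"] + [f"FIELD{i:03d}" for i in range(599)]
chunks = column_chunks(header)
print([len(chunk) for chunk in chunks])  # [255, 255, 92]
```

Each group of names can then be used to write one smaller .csv file that stays within the 255-column limit.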
What projected coordinate system are the GIS boundary data in? [top]
NHGIS shapefiles use Esri's USA Contiguous Albers Equal Area Conic projection. Prior to May 2013, NHGIS provided separate files for Alaska in the Alaska Albers Equal Area Conic projection, for Hawaii in the Hawaii Albers Equal Area Conic, and for Puerto Rico in an Albers Equal Area Conic projection with central meridian, standard parallels and latitude of origin set to match the Puerto Rico State Plane Coordinate System's.
What is the unit of the SHAPE_AREA and SHAPE_LEN? [top]
SHAPE_AREA is an area measurement in square meters. SHAPE_LEN is a perimeter measurement in meters.
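Converting these values to other units is simple arithmetic. A short Python sketch, with function names chosen only for this example:

```python
SQ_METERS_PER_SQ_KM = 1_000_000
METERS_PER_MILE = 1609.344  # international mile
SQ_METERS_PER_SQ_MILE = METERS_PER_MILE ** 2


def area_sq_km(shape_area):
    """Convert SHAPE_AREA (square meters) to square kilometers."""
    return shape_area / SQ_METERS_PER_SQ_KM


def area_sq_miles(shape_area):
    """Convert SHAPE_AREA (square meters) to square miles."""
    return shape_area / SQ_METERS_PER_SQ_MILE


def perimeter_km(shape_len):
    """Convert SHAPE_LEN (meters) to kilometers."""
    return shape_len / 1000


print(area_sq_km(2_000_000))  # 2.0
print(perimeter_km(5_000))    # 5.0
```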
Why isn’t the GISJOIN the same from one decade to the next for a census tract that doesn’t seem to change? [top]
The numbering of census tracts (and other lower levels of geography like census blocks) is determined systematically for entire counties by a combination of local and Census Bureau authorities. In the early years of tract definitions, the numbering systems occasionally underwent dramatic revisions, but even in recent censuses, when numbering systems have been more stable, tract boundary changes in one part of a county can force a renumbering of tracts throughout the county to accommodate a more logical numbering.
Recent releases of the Census TIGER/Line Shapefiles have included numerous improvements in spatial accuracy (from a greater use of GPS and localized data sources). This has resulted in numerous misalignments between the new data and older, less accurate NHGIS boundaries. To address this issue, NHGIS staff conducted a systematic realignment of our historical shapefile boundaries to accord with newer, 2008-based TIGER/Line data.
The Census Bureau, however, subsequently made additional improvements to TIGER/Line features, so the 2008-based files NHGIS created are not consistently comparable with 2010 and later TIGER/Line files. In general, most 2008-based boundaries align better than 2000-based boundaries with more recent TIGER/Line files, but the 2008-based boundaries also include occasional gross inaccuracies.
The realignment project, as mentioned in the previous question, resulted in new boundaries being created for all tract and county shapefiles 1790-2000 based on updated 2008 TIGER/Line boundaries. Because the original and new shapefiles each have instances where they are better suited, both are available for download. Users may wish to download both the 2000- and 2008-based versions of historical boundaries in order to determine which is more suitable for their study area and analysis.
The original shapefiles based, in part, on the 2000 TIGER/Line boundaries are typically a better choice when mapping only pre-2010 data, as all boundaries should align better. If you want to display the 1970 census tracts inside the SMSA boundaries, for example, the 2008-based census tract file may not align with the 2000-based SMSA file.
The new, 2008-based shapefiles align more closely (but not completely) with 2010 and later shapefiles and are typically a better choice when creating an overlay that includes both historical and 2010 or later shapefiles. The 2008-based census tracts for 1990 and 2000, for example, will typically align more closely with 2010-based census tracts.
Census Questions [top]
Do you have all of the American Community Survey datasets? [top]
Currently, NHGIS includes the 2010, 2011 and 2012 ACS datasets. We aim to add data from each new ACS release within six weeks of the Census Bureau release date. We will also continually be adding older ACS datasets, beginning with 2009 datasets. Look for updates on data additions on the NHGIS News page.
What is wrong with the 1960 data? [top]
Sadly, such a common question! The 1960 Census employed a uniquely restrictive data suppression strategy that leaves many data tables with lots of missing data. In addition, the NHGIS can offer only a small set of data tables for states and counties due to a scarcity of digital 1960 Census data, and the 1960 tract data come from two separate sources, resulting in some inconsistent redundancy. Review the Tabular Data Documentation page for more detailed information.
The Minnesota Population Center in collaboration with the Census Bureau is currently working to recover lost data from the 1960 Census of Population and Housing. Once completed in 2013, new 1960 summary files will be available along with additional microdata products.
Data Downloading Questions [top]
Why can’t I select to download data for a single county or a single MSA? [top]
Rather than specifying a specific geographic location, users only select the geographic level of interest. For example, if interested in data for Hennepin County, Minnesota, you would simply select “County” as your geographic level. Then, after downloading the county-level data for the entire United States, users can easily extract the specific locations of interest.
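Pulling out the locations of interest after download takes only a few lines in most tools. Here is a minimal Python sketch; the in-memory table stands in for a downloaded county-level .csv file, and the column names STATE and COUNTY and all values are assumptions made for illustration.

```python
import csv
import io

# Stand-in for a nationwide county-level data file; column names and values are illustrative.
national_csv = io.StringIO(
    "GISJOIN,STATE,COUNTY,AG3001\n"
    "G2700530,Minnesota,Hennepin,1000\n"
    "G2701230,Minnesota,Ramsey,500\n"
    "G5500250,Wisconsin,Dane,400\n"
)

# Keep only the rows for the county of interest.
selected = [
    row
    for row in csv.DictReader(national_csv)
    if row["STATE"] == "Minnesota" and row["COUNTY"] == "Hennepin"
]

for row in selected:
    print(row["GISJOIN"], row["AG3001"])  # G2700530 1000
```

With a real extract, replace the in-memory table with the opened .csv file and adjust the column names to match the codebook.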
NHGIS provides data in this format primarily in the interest of standardizing the selection interface for all years and geographic levels. Giving users the ability to select data from multiple years and data from previously unavailable "compound" geographic levels (see the next FAQ) makes it difficult to support selection by specific geographic extent. Data from different years or from compound geographic levels do not consistently nest within a static set of selectable areas. In addition, over the years NHGIS has discovered that users typically like more data, rather than less! Finally, removing the need to select a geographic extent simply shortens the time it takes users to create an extract.
NHGIS provides access to compound geographic levels. When you click to "show all geographic levels" on the Filter page, you see the standard geographic levels like Census Tract or County, but for each of these, there are numerous compound levels also available with labels describing exactly how the compound level is subdivided.
Standard geographic levels nest consistently within a hierarchy of larger units. For example, census tracts nest within counties, which in turn nest within states. This is a standard geographic level because census tracts cannot be split by county boundaries nor can counties be split by state boundaries. If you download data at this geographic level, a single record for each census tract is returned.
The non-standard or "compound" geographic levels consist of intersections between non-nesting standard levels. For example, you can now download census tracts in a State>Place>County hierarchy. Because census tracts do not nest within places, this geographic level will provide separate records for the portions of census tracts contained within different places, and a census tract's code will appear in multiple records if multiple places fall within its boundary.
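Because a tract's code can appear in several records at a compound level, counts must be summed across those records to recover a whole-tract figure. The Python sketch below shows the idea with made-up records and hypothetical field names. It assumes that the records for a tract together cover the entire tract, and it applies only to counts (medians and other non-additive statistics cannot be combined this way).

```python
from collections import defaultdict

# Made-up records at a State>Place>County>Tract level; field names are hypothetical.
# Tract 000100 is split between two places, so it appears twice.
records = [
    {"COUNTY": "053", "TRACT": "000100", "PLACE": "A", "POP": 1200},
    {"COUNTY": "053", "TRACT": "000100", "PLACE": "B", "POP": 300},
    {"COUNTY": "053", "TRACT": "000200", "PLACE": "A", "POP": 900},
]

# Tract codes are only unique within a county, so the key includes the county code.
totals = defaultdict(int)
for record in records:
    totals[(record["COUNTY"], record["TRACT"])] += record["POP"]

print(dict(totals))  # {('053', '000100'): 1500, ('053', '000200'): 900}
```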
How long does a data extract take? [top]
The time needed to make an extract differs depending on the size of the data extract requested and the load on our server. Extracts can take from a few seconds to an hour or more. The system sends an email when the extract is completed, so there is no need to stay active on the NHGIS site while the extract is being made. If users wish, however, they can stay on the Extracts History page following an extract request. Refreshing the web browser will allow the user to see progress being made on the extract request.
Why can’t I download data at the census tract level from before 1910? [top]
Census tracts did not exist back then! NHGIS only has data tables and GIS shapefiles for areas defined by the US Census Bureau. In addition, it is important to remember that not all levels of census enumeration have shapefiles associated with them.
For example, while data tables can be downloaded at the “Congressional District (1983-1985, 98th Congress)” level, NHGIS does not have a corresponding GIS shapefile.
Why do I receive so many data files in my downloaded zipped file? [top]
A tabular data file is returned for each dataset, geographic level, and year for which tables are selected. If you only downloaded county level data for one year and from one dataset, you will only receive one data file. If your data extract request was more expansive, however, you can expect more data files. For example, selecting data from 1940 and 1950 at the state and county geographic levels, you would receive four data files (1940 state, 1940 county, 1950 state, 1950 county).
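The example in the answer above can be worked through in a few lines of Python: one file is expected for every combination of year and geographic level selected. This sketch only enumerates the combinations; the labels are not actual NHGIS file names.

```python
from itertools import product

years = [1940, 1950]
levels = ["state", "county"]

# One tabular data file per (year, geographic level) combination.
expected_files = [f"{year} {level}" for year, level in product(years, levels)]

print(len(expected_files))  # 4
print(expected_files)       # ['1940 state', '1940 county', '1950 state', '1950 county']
```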
NHGIS and General Questions [top]
What are aggregate data? [top]
Aggregate data summarize a set of individuals through counts, sums, means or other aggregate statistics. The NHGIS specifically provides spatially aggregated census data: data summarizing individuals within particular areas, such as states or counties, where the "individuals" might be persons, housing units, farms, libraries, newspapers or any other features that were at some point counted in a U.S. census. No individual-level records, with or without personally identifiable information, are included anywhere in the NHGIS. So if you are trying to find your great-great-great grandfather who homesteaded in the late 1800s, you will not have any luck with NHGIS data.
What is the source of the data? [top]
The original source of most of the aggregate data, with a few exceptions, is the US Census. For censuses since 1970, the NHGIS obtained digital data directly from the Census Bureau. For earlier censuses, NHGIS data are generally derived from secondary sources: separate projects which have, over the course of several decades, undertaken the arduous conversion of pre-computer-age historical data from print media to a digital, machine-readable form. Most of this work was completed by Michael Haines, Donald Bogue, Andrew Beveridge and their respective research teams. Documentation for most sources is available on the Tabular Data Documentation page.
The GIS boundary files are based on the TIGER/Line data that the US Census Bureau creates, which NHGIS staff edited to produce all pre-1990 boundaries. The primary guide for historical census tract boundaries was original census maps and, for early county boundaries, it was the book, Map Guide to the U.S. Federal Censuses 1790-1920, by William Thorndale and William Dollarhide (Genealogical Publishing Co., Baltimore, MD, c. 1987).
What data are NHGIS working on right now? [top]
Currently, we are preparing to release the 2009 and prior ACS datasets, as well as new time series tables. In addition, we are in the early stages of creating GIS boundary files for historical places.
How do I get a job working with NHGIS? [top]
We’re flattered you’re interested in us! NHGIS is a part of the Minnesota Population Center, which is itself part of the University of Minnesota on the Twin Cities campus. Please visit the employment pages at MPC and at the U of M for up-to-date job postings for both students and professionals.
I have a suggestion for how you can improve your website! Do you want to hear it? [top]
Sure! We are always open to new suggestions. This does not mean we can always act on the suggestion, but many changes to NHGIS have come about through users’ suggestions. You may direct yours to email@example.com.
Should I cite NHGIS in my paper? How? [top]
Reports and publications using NHGIS data must be cited appropriately. The citation is:
Minnesota Population Center. National Historical Geographic Information System: Version 2.0. Minneapolis, MN: University of Minnesota 2011.
Also, it is extremely important for us that you send us a citation or link for any paper or article you write using NHGIS data. Continued funding for the NHGIS depends on our ability to show our sponsor agencies that researchers are using the data for productive purposes.