An NHGIS time series table links together comparable statistics from multiple U.S. censuses in one downloadable bundle. A table comprises one or more related time series, each of which describes a single aggregate statistic measured at multiple times. The set of characteristics, years, and geographic levels covered varies from table to table.
For example, NHGIS provides two Persons by Sex time series tables. Both tables contain two time series: Persons: Male and Persons: Female. The two tables differ in the geographic levels and years they cover. The first provides state and county data for all censuses back to 1820. The second provides data for states, counties, county subdivisions, census tracts, and places for censuses back to 1970.
To define time series tables, NHGIS researchers create metadata specifying sets of comparable statistics from various source datasets. When a user requests a time series table, the NHGIS system uses the metadata to piece together data from the source files where the requested statistics are located.
In many instances, generating a single time series (e.g., Persons: Under 5 years) requires aggregating multiple source statistics (e.g., summing Males under age 5 and Females under age 5) to produce a comparable statistic across all years. NHGIS researchers specify any needed operations in the metadata, and the extract system completes the computations in advance, saving NHGIS users from another big data processing hassle!
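The metadata-driven aggregation described above can be sketched roughly as follows. This is only an illustration, not the NHGIS implementation: the metadata structure, the column names, and the counts are all hypothetical.

```python
# Hypothetical metadata: for each time series, the source columns to sum
# in each census year. Names and structure are illustrative only.
TIME_SERIES_SPEC = {
    "Persons: Under 5 years": {
        2000: ["males_under_5", "females_under_5"],
        2010: ["males_under_5", "females_under_5"],
    },
}

# Made-up source counts for a single area, keyed by census year.
SOURCE_DATA = {
    2000: {"males_under_5": 120, "females_under_5": 115},
    2010: {"males_under_5": 130, "females_under_5": 126},
}


def build_time_series(name, spec, source):
    """Sum the listed source columns for each year of one time series."""
    return {
        year: sum(source[year][column] for column in columns)
        for year, columns in spec[name].items()
    }


under_5 = build_time_series("Persons: Under 5 years", TIME_SERIES_SPEC, SOURCE_DATA)
print(under_5)  # {2000: 235, 2010: 256}
```

Keeping the list of source columns in metadata rather than in code means a year whose source file already reports the combined statistic could simply list a single column.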
Time series tables are designed to encompass as many different statistics as possible for a predetermined set of topics, years, and geographic levels:
- Topic coverage: Researchers define tables for one tabulation type at a time, where a “tabulation type” is a unique combination of measured feature, aggregation method, and classification dimensions. Example tabulation types include: Persons by Sex by Age, Persons by Sex by Age by Race, Median Age by Sex, Households by Sex and Age of Householder.
Although these example types each include some combination of the concepts of “Sex” and “Age,” it is necessary to re-examine available statistics and define time series separately for each type because data availability may differ significantly among them. For example, for a given time range and geographic level, it may be possible to create a time series for Males age 85 and above but not for Asian males age 85 and above. NHGIS researchers therefore consider each type separately and produce time series for as many categories as possible in each case.
As new time series tables are released, the complete set will cover a larger and larger set of tabulation types. Initial releases only covered types available in the 2010 Census, i.e., those involving characteristics measured by 100% count statistics such as sex, age, race, Hispanic or Latino origin, household and group quarters type, housing occupancy and tenure, etc. Later releases cover characteristics from American Community Survey data and earlier sample-based datasets, such as education, income, marital status, place of birth, housing features, etc.
To see all topics for which time series are available, open the Topics filter window through the NHGIS Data Finder and look for the "TS" icon, which indicates topics covered by time series tables.
- Year coverage: Initial time series assembly has focused on censuses since 1970 because NHGIS 1970 data cover a much larger range of topics and geographic levels than the 1960 data. We plan to provide more tables spanning longer time ranges in future releases.
The year coverage for individual tables also depends on the ranges of time for which different aggregate statistics are available. For example, the Persons by Race Combination table extends only as far back as 2000 because the 2000 census was the first to tabulate such counts.
- Geographic coverage: Initial releases provide data for eight different geographic levels: Nation, Region, Division, State, County, Census Tract, County Subdivision, and Place. After selecting a table, users may choose which levels to download data for.
The geographic levels available for any table are also restricted according to the availability of statistics for all years covered by the table. For example, because the 1970 census summary files do not provide statistics at the Nation, Region, or Division levels, tables covering 1970 do not provide data for any of these levels. Similarly, tables that use data from 1990 Summary Tape Files 2 or 4 do not provide statistics for the Place level because the Place data in those summary files was restricted to larger Places.
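One way to picture the geographic restriction is as an intersection of the levels available in every year a table covers. The sketch below assumes a hypothetical availability lookup. Only the absence of Nation, Region, and Division in 1970 comes from the text above; the other entries are placeholders.

```python
ALL_LEVELS = [
    "Nation", "Region", "Division", "State", "County",
    "Census Tract", "County Subdivision", "Place",
]

# Hypothetical availability by census year. The 1970 entry omits Nation,
# Region, and Division as described in the text; other years are assumed.
LEVELS_BY_YEAR = {
    1970: set(ALL_LEVELS) - {"Nation", "Region", "Division"},
    2000: set(ALL_LEVELS),
    2010: set(ALL_LEVELS),
}


def table_levels(years):
    """Return the levels available in every listed year, in display order."""
    common = set(ALL_LEVELS)
    for year in years:
        common &= LEVELS_BY_YEAR[year]
    return [level for level in ALL_LEVELS if level in common]


print(table_levels([1970, 2000, 2010]))
# ['State', 'County', 'Census Tract', 'County Subdivision', 'Place']
```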
Time series tables can be downloaded in three different layouts:
- Time varies by file: Data for different times are placed in different files. For example, state-level data for a table covering 2000 and 2010 is separated into two files: one file for 2000 and one file for 2010.
- Time varies by column: Data for different times are placed in separate columns within one file. State-level data for a table covering 2000 and 2010 is delivered in one file with one row for each state and two columns for each time series: one column for 2000 and one column for 2010.
- Time varies by row: Data for different times are placed in separate rows within one file. State-level data for a table covering 2000 and 2010 is delivered in one file with two rows for each state: one row for 2000 and one row for 2010.
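The three layouts carry the same data in different shapes. The sketch below starts from "time varies by row" records and derives the other two. The field names, the `series_year` column naming, and the counts are made up for illustration and do not reflect actual NHGIS column codes.

```python
from collections import defaultdict

# "Time varies by row": one record per state per year (made-up counts).
by_row = [
    {"state": "Alabama", "year": 2000, "male": 100, "female": 110},
    {"state": "Alabama", "year": 2010, "male": 105, "female": 116},
    {"state": "Alaska", "year": 2000, "male": 40, "female": 38},
    {"state": "Alaska", "year": 2010, "male": 45, "female": 42},
]


def to_by_file(rows):
    """'Time varies by file': one list of records per year."""
    files = defaultdict(list)
    for row in rows:
        record = {key: value for key, value in row.items() if key != "year"}
        files[row["year"]].append(record)
    return dict(files)


def to_by_column(rows, series=("male", "female")):
    """'Time varies by column': one record per state, one column per series per year."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row["state"], {"state": row["state"]})
        for name in series:
            record[f"{name}_{row['year']}"] = row[name]
    return list(wide.values())


by_file = to_by_file(by_row)
by_column = to_by_column(by_row)
print(by_column[0])
```

In this sketch the by-row layout is the most convenient starting point because each record carries its own year, so the other two shapes can be derived without parsing column names.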
NOTE: The time varies by column layout currently uses nominal integration, which matches area units across time by their names and codes only, without regard to boundary relationships. As a result, where a matched code refers to different areas in different years, a single row may provide statistics for distinctly different areas.
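A small sketch can make the caveat concrete. It assumes two hypothetical sets of counts keyed by area code, and assumes (purely for illustration) that area "0001" was split after 2000, with part of it becoming "0003". How NHGIS actually handles codes present in only one year is not stated here, so filling them with `None` is also an assumption.

```python
# Made-up counts keyed by a hypothetical area code.
counts_2000 = {"0001": 4000, "0002": 3000}
counts_2010 = {"0001": 2500, "0003": 2000}


def nominal_join(earlier, later):
    """Match areas by code alone, ignoring any boundary changes."""
    rows = []
    for code in sorted(set(earlier) | set(later)):
        rows.append({
            "code": code,
            "persons_2000": earlier.get(code),  # None if the code is absent
            "persons_2010": later.get(code),
        })
    return rows


for row in nominal_join(counts_2000, counts_2010):
    print(row)
```

The row for "0001" pairs 4000 with 2500, which looks like a population decline, yet under the assumed split the two counts describe different areas. A code-only match cannot detect this.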
We are working to implement two additional integration strategies to address boundary changes: minimal aggregation, which aggregates units into “least common denominator” super-units that have static boundaries across time; and areal interpolation, which will provide estimates of data at multiple times for the reporting areas of one census. More information on these different approaches will be provided with future releases.
For each time series table, NHGIS provides a complete listing of table contents, coverage, and sources, along with notes describing any known comparability issues and links to relevant source documentation. These details can be accessed by clicking on a table name in the Data Finder. The complete set of all table details is also available here:
The definition, documentation, and dissemination of NHGIS time series tables are central components of the Integrated Spatio-Temporal Aggregate Data Series (ISTADS) project at the Minnesota Population Center, with funding provided by the Eunice Kennedy Shriver National Institute of Child Health & Human Development at the National Institutes of Health.