National Data Sets, How to Choose Them, How to Use Them

Penelope Z. Weinberger, American Association of State Highway and Transportation Officials


Compares large, complex national survey data sets and offers bases for comparing them.


The US population is about 323 million today (and grows by four people a minute). Transportation planners who serve this enormous population depend on only two national surveys for the bulk of the travel data that informs infrastructure, planning, and decision making: the US Census Bureau’s ongoing American Community Survey (ACS), from which the Census Transportation Planning Products (CTPP) data is derived, and the USDOT’s National Household Travel Survey (NHTS). The ACS asks a series of commute-based questions that give a robust picture of the trip to work for the nation. The ACS is based on a sample of about 8 percent of households over five years. The CTPP program commissions a customized tabulation of ACS data tailored for transportation planning applications. This data set is used for travel model validation and calibration, an input to the long-range plans required of the 408 metropolitan areas and 50 states as a condition of Federal aid. Other uses of the CTPP range from generating demographic profiles to corridor planning, Environmental Justice analysis, and trend analysis, including the national commute trends detailed in Commuting in America.
The NHTS is smaller and less frequent but more in-depth. It is a diary-based survey that was collected seven times between 1969 and 2009, with an eighth iteration scheduled for 2016. In 2001, the NHTS collected all household travel from a national core of 26,000 households, with regional add-ons contributing 43,000 more households, for a total sample of about 0.06% of US households. In 2009, the core sample remained the same, but the add-ons totaled nearly 125,000 households, bringing the sample to about 0.13% of the total.
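The sample shares quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the function name and the US household totals (roughly 108 million in 2001 and 117 million in 2009) are assumptions that do not come from this paper, chosen only to show how the 0.06% and 0.13% figures arise.

```python
def sample_share_pct(core_households, add_on_households, total_households):
    """Return the survey sample as a percentage of all households."""
    return 100.0 * (core_households + add_on_households) / total_households


# Assumed approximate US household totals; these are NOT taken from the paper.
US_HOUSEHOLDS_2001 = 108_000_000
US_HOUSEHOLDS_2009 = 117_000_000

# Core and add-on sample sizes as stated in the text.
share_2001 = sample_share_pct(26_000, 43_000, US_HOUSEHOLDS_2001)
share_2009 = sample_share_pct(26_000, 125_000, US_HOUSEHOLDS_2009)

print(f"2001 NHTS sample share: {share_2001:.2f}%")
print(f"2009 NHTS sample share: {share_2009:.2f}%")
```

Under these assumed totals the shares round to the percentages given in the text. By comparison, the ACS five-year sample of about 8 percent of households is roughly two orders of magnitude larger.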
The CTPP is better suited to assessing phenomena in small geographies, while the NHTS captures all trips, not just commute trips. Both data sets are used to inform transportation policy and to assess how previous transportation investments have performed. This paper is a comparative analysis of the two data sets. It discusses their methodology and statistical nuances, the subjects they cover, and their weaknesses, strengths, and uses. The work aims to inform practitioners who deal with large and varied data sets, and to give some insight into how to evaluate such data and which types of analysis are appropriate for a given data set.


