On the Application of Heckman's Sample Selection Model to Travel Survey Data: Some Practical Guidelines

C Vance, S Buchheim, DLR, German Aerospace Centre, DE


This paper analyzes the determinants of motor vehicle use by applying econometric techniques to panel data from Germany. Using the Heckit model, we simultaneously address the discrete choice of car use and the continuous choice of distance traveled.


The determinants of motor vehicle use bear on a range of themes in the study of mobility behavior. Private cars not only contribute to air and noise pollution, but are also major sources of congestion, injuries, and fatalities on public roadways. The behaviors that give rise to these negative external effects emerge largely from decision-making at the household level, including choices about the allocation of household resources and responsibilities among individual members. These choices, in turn, give rise to in-home and out-of-home activity patterns, from which the demand for travel by various modes is derived. In Germany, as elsewhere in the industrialized world, the demand for motor vehicle travel is of particular interest because of its strong growth in recent years, with the number of personal vehicles increasing by 15.2% between 1995 and 2001 (Kraftfahrt-Bundesamt, 2002). Understanding the preferences and constraints that determine motorization can be useful to several policy applications, including assessments of the provision of public transport infrastructure, forecasting of trends in air pollution, and the evaluation of zoning and other land use measures.

The present paper analyzes the determinants of individual motor vehicle use by estimating an econometric model on a panel of travel-diary data collected in Germany between 1994 and 2001 (http://mobilitaetspanel.ifv.uni-karlsruhe.de/). The data set is augmented by various measures of urban form (e.g., street density), which were calculated in a GIS on the basis of postal code boundaries. A central goal of the research is to explore the interactions between the urban form measures and the attributes of individual household members, including gender, employment status, and activity pattern, as determinants of private car use.

Unlike the majority of studies to date, we address the issue of vehicle access from two angles: the discrete choice of car use and the continuous choice of distance traveled. These distinct yet interrelated decisions are analyzed with a Heckit model that partitions individuals according to their use of the car while simultaneously controlling for biases emerging from sample selectivity. The model comprises two stages. Because roughly 40% of the observations record no car use at all on a given day, stage 1 estimates a probit model of the discrete choice of vehicle use over the period of a day. In stage 2, a weighted least squares (WLS) model of distance traveled is estimated only for those individuals who traveled some positive distance. To control for sample selectivity biases, the inverse Mills ratio calculated from stage 1 is included as an explanatory variable in the second-stage estimation. Both the probit and WLS models are estimated using a population-averaged panel specification that controls for the within-person correlation of observations.
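The two-stage logic described above can be illustrated with a minimal sketch on simulated data. It is not the estimation used in this paper: it treats the data as a single cross-section, uses plain least squares instead of WLS in stage 2, and omits the population-averaged panel specification. All variable names, coefficient values, and the simulated regressors (z1, z2) are assumptions made purely for illustration.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_pdf(z):
    """Standard normal density."""
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def fit_probit(Z, d, max_iter=50, tol=1e-8):
    """Stage 1: probit coefficients by Fisher scoring."""
    gamma = np.zeros(Z.shape[1])
    for _ in range(max_iter):
        index = Z @ gamma
        p = np.clip(norm_cdf(index), 1e-10, 1.0 - 1e-10)
        phi = norm_pdf(index)
        score = Z.T @ (phi * (d - p) / (p * (1.0 - p)))
        info = (Z * (phi ** 2 / (p * (1.0 - p)))[:, None]).T @ Z
        step = np.linalg.solve(info, score)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    return gamma

def heckit_two_step(Z, d, X, y):
    """Stage 2: least squares on the selected sample, adding the
    inverse Mills ratio from stage 1 as an extra regressor."""
    gamma = fit_probit(Z, d)
    used = d == 1
    index = Z[used] @ gamma
    imr = norm_pdf(index) / norm_cdf(index)  # inverse Mills ratio
    X2 = np.column_stack([X[used], imr])
    coef, *_ = np.linalg.lstsq(X2, y[used], rcond=None)
    return gamma, coef

# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n = 20000
z1 = rng.normal(size=n)
z2 = rng.normal(size=n)  # appears only in the selection equation
u = rng.normal(size=n)   # selection error
e = 0.6 * u + math.sqrt(1.0 - 0.36) * rng.normal(size=n)  # corr(u, e) = 0.6

Z = np.column_stack([np.ones(n), z1, z2])  # stage 1 regressors
X = np.column_stack([np.ones(n), z1])      # stage 2 regressors
true_gamma = np.array([0.3, 0.5, 1.0])
true_beta = np.array([2.0, 1.0])

d = (Z @ true_gamma + u > 0).astype(float)  # 1 if the car is used
y = X @ true_beta + e                       # distance, used only where d == 1

gamma_hat, coef_hat = heckit_two_step(Z, d, X, y)

# Naive least squares on users only, ignoring selection, for comparison.
naive_coef, *_ = np.linalg.lstsq(X[d == 1], y[d == 1], rcond=None)

print("probit coefficients:      ", gamma_hat)
print("stage 2 (beta, IMR coef): ", coef_hat)
print("naive OLS on users only:  ", naive_coef)
```

In this simulation the last element of coef_hat is the coefficient on the inverse Mills ratio; comparing the corrected and naive intercepts gives a rough sense of the selectivity bias that the second stage is meant to remove.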

In applying the model, we highlight several conceptual and methodological complications that arise in the context of sample selectivity problems. Specifically, care is warranted with respect to:
· the formulation of the research question, and whether interest centers on the potential or the actual outcome;
· the assessment of the extent to which collinearity problems could undermine the results, for which the condition number provides a useful diagnostic tool;
· the interpretation of interaction coefficients in the first-stage probit model;
· the interpretation of coefficients in the second-stage WLS model for variables that also appear in the first-stage probit model.
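As an illustration of the collinearity diagnostic mentioned in the list above, the following sketch computes a condition number for a regressor matrix. The scaling convention (columns scaled to unit length before taking the ratio of the largest to the smallest singular value) and the simulated matrices are assumptions made for illustration, not the procedure used in this paper. A commonly cited rule of thumb treats values above roughly 30 as a sign of potentially harmful collinearity. In two-step selection models, one frequent source of such collinearity is that the inverse Mills ratio is close to linear in the stage 1 index over much of its range.

```python
import numpy as np

def condition_number(X):
    """Ratio of the largest to the smallest singular value of X
    after scaling each column to unit length."""
    X = np.asarray(X, dtype=float)
    scaled = X / np.linalg.norm(X, axis=0)
    singular_values = np.linalg.svd(scaled, compute_uv=False)
    return singular_values[0] / singular_values[-1]

# Hypothetical regressor matrices for comparison.
rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)
w = rng.normal(size=n)

well_conditioned = np.column_stack([np.ones(n), x, w])
near_collinear = np.column_stack([np.ones(n), x, x + 1e-3 * w])

cond_ok = condition_number(well_conditioned)
cond_bad = condition_number(near_collinear)

print("independent regressors:  ", cond_ok)
print("nearly collinear columns:", cond_bad)
```

Applied to a second-stage regressor matrix that includes the inverse Mills ratio, the same function would indicate whether that matrix is close to rank deficient.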

Preliminary results suggest strong effects of gender on both the discrete choice of car use and the continuous choice of distance traveled: women have a lower probability of using the car and travel shorter distances by car than men. Interestingly, an interaction term reveals that employed women travel shorter distances by car than unemployed women. The urban form variables are also found to be statistically significant, but no differences in the effects of these variables are found between women and men. For policy purposes, we discuss how future research needs to disentangle potential simultaneity in decisions regarding mode choice and residential location. To this end, more detailed data are needed not only on urban form, but also on household residential histories.


