Specification Testing of Discrete Choice Models: a Note on the Use of a Nonparametric Test
M Fosgerau, Danish Transport Research Institute, DK
This paper describes a nonparametric test procedure which uses a combination of smoothed residual plots and a test statistic able to detect general misspecification in discrete choice models.
It is standard practice in regression models to perform model control using the residuals of the estimated model. Residuals are plotted to verify whether they are in fact white noise unrelated to the independent variables. Residuals are less easily defined in discrete choice models and similar model control is rarely performed for such models.
A variety of specification tests for some discrete choice models are in fact available in the literature, but they are not widely used. Lechner (1991) presents some specification tests for the binary logit model. Gourieroux et al. (1987a, 1987b) present tests based on generalised residuals for a range of models including the multinomial logit model. McFadden (1987) presents regression-based specification tests for the multinomial logit model that are similar in nature to the test presented here. The seminal paper by McFadden and Train (2000) provides specification tests of MNL and mixed logit against alternatives with more mixing. Finally, software exists that allows observed choices to be compared with predictions when data are grouped on a categorical variable. This approach may be viewed as a kind of residual test.
The point of this paper is to describe how the nonparametric test of functional form in Zheng (1996) may be applied to discrete choice models of general form. This means that the test also applies to general models such as the mixed generalised extreme value model. The Zheng test is based on nonparametric kernel regression of the parametric model residuals against the independent variables. Applied to a discrete choice model, where the residual is the difference between the observed choice and the predicted choice probability, the Zheng test is therefore based on a comparison of predicted and observed choices.
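For orientation, the statistic can be sketched as follows. The notation is an assumption made here and is not taken from the paper: e_i denotes the residual of observation i for a given alternative (observed choice indicator minus predicted probability), z_i the d-dimensional vector of regressors, K a kernel function and h a bandwidth. The form below follows the usual statement of the Zheng (1996) test; the original article should be consulted for the exact conditions.

```latex
V_n = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i}
      \frac{1}{h^{d}} K\!\left(\frac{z_i - z_j}{h}\right) e_i e_j ,
\qquad
\hat{\Sigma} = \frac{2}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i}
      \frac{1}{h^{d}} K^{2}\!\left(\frac{z_i - z_j}{h}\right) e_i^{2} e_j^{2} ,

T_n = \sqrt{\frac{n-1}{n}} \; \frac{n h^{d/2} V_n}{\sqrt{\hat{\Sigma}}}
    = \frac{\sum_{i} \sum_{j \neq i} K_{ij} e_i e_j}
           {\sqrt{2 \sum_{i} \sum_{j \neq i} K_{ij}^{2} e_i^{2} e_j^{2}}} ,
\qquad K_{ij} = K\!\left(\frac{z_i - z_j}{h}\right).
```

Under the null hypothesis of correct specification, T_n is asymptotically standard normal, and the test is one-sided: large positive values indicate misspecification.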
A general problem in using nonparametric techniques is that the demands on data increase exponentially in the dimension of the space of independent variables. This is the so-called curse of dimensionality. The issue is addressed in this paper by showing how such tests may be applied to a subset of the variables or, more generally, to functions of the variables such as the index representing the indirect utility of a choice alternative. It is thus possible to apply nonparametric tests to a model while regressing on only a small number of variables, or even just one. This reduces the demand on data and is of practical importance when the model to be tested has several independent variables.
Furthermore, the test may be applied to variables that are not included in the model. Thus the test may serve as a test of omitted variables.
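As an illustration of regressing residuals on a single variable, including one that is not in the model, the following is a minimal sketch and not the paper's implementation. It assumes a binary choice setting, a Gaussian kernel and a rule-of-thumb bandwidth, and it uses the ratio form of the Zheng statistic in which the bandwidth and sample-size factors cancel. The function name `zheng_statistic`, the simulated data and the "fitted" probabilities are all hypothetical.

```python
import numpy as np

def zheng_statistic(resid, z, h):
    """Zheng-type statistic for residuals regressed on z (sketch).

    resid : residuals, observed choice indicator minus predicted probability
    z     : regressor(s), shape (n,) or (n, d); may be a utility index
            or a variable omitted from the model
    h     : kernel bandwidth (assumed given; its choice is not addressed here)
    """
    resid = np.asarray(resid, dtype=float)
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    # Pairwise scaled differences between observations, shape (n, n, d)
    diff = (z[:, None, :] - z[None, :, :]) / h
    # Gaussian product kernel; normalising constants cancel in the ratio
    k = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
    np.fill_diagonal(k, 0.0)  # exclude the i == j terms
    ee = np.outer(resid, resid)
    numerator = np.sum(k * ee)
    denominator = np.sqrt(2.0 * np.sum(k ** 2 * ee ** 2))
    return numerator / denominator

# Hypothetical example: the true model depends on x and w,
# while the model under test uses x only.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
w = rng.normal(size=n)  # variable omitted from the model under test
p_true = 1.0 / (1.0 + np.exp(-(x + 1.5 * w)))
y = (rng.uniform(size=n) < p_true).astype(float)

p_model = 1.0 / (1.0 + np.exp(-x))  # stand-in for fitted probabilities
resid = y - p_model
h = n ** (-1.0 / 5.0)  # rule-of-thumb bandwidth, an assumption

t_omitted = zheng_statistic(resid, w, h)  # regress on the omitted variable
t_index = zheng_statistic(resid, x, h)    # regress on the utility index
print(t_omitted, t_index)
```

Each statistic would be compared with the upper tail of the standard normal distribution. Because only one regressor enters the kernel in each call, the data requirement does not grow with the number of variables in the model.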
Association for European Transport