A Rigorous Framework for Empirically Comparing Mixed GEV and Mixed MNL with Error Components Models


L A Garrow, T Bodea, Georgia Institute of Technology, US


We compare two approaches for incorporating correlation among alternatives in discrete choice models. Experimental design techniques are used to develop a formal framework for assessing the statistical significance and relevance of different factors.


Logit models are often used in transportation to predict travelers' choice of destination, mode, route, etc. The most commonly used models, namely the multinomial logit (MNL), the nested logit (NL), and other members of the generalized extreme value (GEV) class, have several limitations. In particular, these models cannot incorporate random taste variation, and they impose restricted substitution patterns among alternatives. In recent years, there has been increasing interest in mixed logit models because they offer a high degree of modeling flexibility. Mixed logits are conceptually attractive because they capture random taste variation by allowing parameters of the utility function to vary across individuals. In addition, they have been shown, in theory, to approximate any random utility model through the inclusion of an appropriate set of error components, which induce correlation among alternatives that share common unobserved characteristics.

Within a mixed logit framework, there are two approaches for incorporating correlation among alternatives. The first approach uses a mixed MNL model and allows the parameters of the utility function to vary across alternatives in such a way that analogs to GEV models, such as the NL model, are created. The attributes used to create correlation and/or heterogeneity are called error components. Thus, the mixed MNL contains parameters that vary across individuals (and create random taste variation) and parameters that vary across alternatives (and create correlation and/or heterogeneity among alternatives). The second approach uses a more complicated GEV model, such as the NL model, to represent correlation among alternatives. In the mixed NL model, random taste variation is incorporated by allowing parameters of the utility function to vary across individuals, which may introduce correlation among alternatives in addition to the correlation created by the underlying NL model. The benefit of this approach is that it has fewer dimensions of integration and so should require less computational time. The disadvantage is that the researcher needs to program a more complicated log-likelihood function.
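As an illustration of the first approach, the following minimal Python sketch simulates choice probabilities for an MNL kernel with one normally distributed error component per nest. It is not the estimation code used in this study; the function names, the normal distribution of the error components, and the arguments (`nest_of`, `sigma`, `n_draws`) are assumptions made for the example.

```python
import numpy as np

def mnl_probs(v):
    """MNL probabilities; the last axis of v indexes alternatives."""
    e = np.exp(v - v.max(axis=-1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=-1, keepdims=True)

def mixed_mnl_ec_probs(v, nest_of, sigma, n_draws=2000, seed=0):
    """Simulated probabilities for an MNL kernel with error components.

    v       : systematic utilities, one per alternative
    nest_of : nest index of each alternative (hypothetical encoding)
    sigma   : standard deviation of the error component of each nest
              (equal values = homogeneous, unequal = heterogeneous)
    """
    rng = np.random.default_rng(seed)
    nest_of = np.asarray(nest_of)
    n_nests = nest_of.max() + 1
    # One standard normal draw per nest and per simulation draw.
    eta = rng.standard_normal((n_draws, n_nests))
    # Alternatives in the same nest receive the same shock, which is
    # what induces correlation among them.
    shocks = (eta * np.asarray(sigma, dtype=float))[:, nest_of]
    # Average the MNL kernel over the draws (simulated integration).
    return mnl_probs(np.asarray(v, dtype=float) + shocks).mean(axis=0)

# Three alternatives; the first two share a nest.
p = mixed_mnl_ec_probs([0.0, 0.0, 0.0], nest_of=[0, 0, 1], sigma=[2.0, 0.0])
```

Each nest adds one dimension of integration in this formulation, whereas a mixed NL model would handle the nesting in closed form and integrate only over the random taste parameters.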

To date, there have been several papers on this subject. On the theoretical front, Garrow and Koppelman (2004) show that correlation among alternatives is not uniquely identified for all NL mixture models (that is, models that use the core MNL probability function and homogeneous error components to replicate the NL correlation structure). They also propose an alternative formulation of a homogeneous error structure that requires fewer error components. On the experimental front, Hess, Bierlaire, and Polak (2005) and Gopinath, Schofield, Walker and Ben-Akiva (2005) have empirically compared mixed GEV models with mixed MNL models that include error components. In general, prior experimental work is characterized by (1) limited testing on a few datasets or, if using synthetic data, on a single replicate (i.e., only a single dataset generated from the underlying distributions) and (2) qualitative assessment of results in terms of measures of fit, substitution effects, and prediction.

The key objective of this paper is to extend previous work by theoretically and empirically comparing these two approaches. Theoretically, there are three main contributions of this paper. First, we explicitly discuss how synthetic data should be generated to create a "true" dataset and highlight the importance of using multiple datasets or replicates. Second, we describe how different characteristics of a discrete choice model (such as number of nesting levels, choice frequencies, etc.) can affect empirical precision and results. Finally, we use experimental design techniques to provide a rigorous framework for formally (versus qualitatively) assessing the statistical significance of empirical results and evaluation metrics.
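To make the idea of replicates concrete, the sketch below draws several synthetic datasets from one known two-level NL model. It is a simplified illustration rather than the data generation procedure of this study: the function names are hypothetical, utilities are assumed identical across observations, and choices are sampled directly from the closed-form NL probabilities.

```python
import numpy as np

def nl_probs(v, nest_of, mu):
    """Two-level NL probabilities.

    v       : systematic utilities, one per alternative
    nest_of : nest index of each alternative
    mu      : logsum coefficient of each nest (0 < mu <= 1)
    """
    v = np.asarray(v, dtype=float)
    nest_of = np.asarray(nest_of)
    mu = np.asarray(mu, dtype=float)
    scaled = np.exp(v / mu[nest_of])
    # Sum of scaled exponentiated utilities within each nest.
    nest_sum = np.bincount(nest_of, weights=scaled, minlength=len(mu))
    within = scaled / nest_sum[nest_of]      # P(alternative | nest)
    nest_term = nest_sum ** mu               # exp(mu * logsum)
    marginal = nest_term / nest_term.sum()   # P(nest)
    return within * marginal[nest_of]

def make_replicates(v, nest_of, mu, n_obs=1000, n_reps=20, seed=0):
    """Draw n_reps datasets of n_obs chosen alternatives from the same
    "true" NL model; each row of the result is one replicate."""
    rng = np.random.default_rng(seed)
    p = nl_probs(v, nest_of, mu)
    return rng.choice(len(p), size=(n_reps, n_obs), p=p)

# Three alternatives; the first two share a nest with logsum 0.5.
reps = make_replicates([0.0, 0.5, 0.2], nest_of=[0, 0, 1], mu=[0.5, 1.0])
```

Estimating both model forms on every row of `reps` yields a distribution of estimates per model, so that differences between the approaches can be judged against sampling variability rather than from a single dataset.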

Preliminary computational results indicate a lack of empirical identification for mixed MNL models in which logsum coefficients differ by approximately 0.1 units. This study explores the sensitivity of empirical identification in mixed MNL models to different factors, including number of nesting levels, number of alternatives, choice frequency of alternatives, amount of correlation among nests, and homogeneous versus heterogeneous error assumptions. From a policy perspective, it is important to understand which factors lead to empirical identification problems, because the inability to replicate correlation among alternatives in a mixture analog can lead to different forecasts and different interpretations of substitution patterns among alternatives. The results of the empirical analysis provide insight into interpretation problems that can be caused by numerical error in mixture analogs and into the sensitivity of forecasts from these two modeling approaches.


Garrow, L.A. & F.S. Koppelman (2004). Comparison of choice models representing correlation and random taste variation. Working Paper. Department of Civil and Environmental Engineering, Northwestern University.

Gopinath, D., M.L. Schofield, J.L. Walker & M. Ben-Akiva (2005). Comparative analysis of discrete choice models with flexible substitution patterns. Paper presented at the 84th Transportation Research Board Meeting, Washington, D.C.

Hess, S., M. Bierlaire & J.W. Polak (2005). Capturing taste heterogeneity and correlation structure with mixed GEV models. Paper presented at the 84th Transportation Research Board Meeting, Washington, D.C.


Association for European Transport