Outlying Sensitivities in Discrete Choice Data: Causes, Consequences and Remedies
D Campbell, Queen's University Belfast, UK; S Hess, ITS, University of Leeds, UK
Unlike in other areas of econometric analysis, the sensitivity of results to outliers is rarely assessed or even explored in stated choice analysis. Irrespective of whether or not outliers are genuine, their presence means that choice models that do not rely on the homogeneity of preferences may be more appropriate. Whilst this heterogeneity can be accommodated by treating the preferences as random and estimating the parameters of their distribution, the presence of outliers may exaggerate the true extent of heterogeneity. This is especially the case when the random parameters are specified with distributions that are unbounded on at least one side, such as the commonly used normal and lognormal, neither of which has an upper bound. For this reason, nonparametric methods may provide a better means of representing the unobserved preference heterogeneity.
This paper proposes a finite mixture modelling approach for approximating the mixing distribution, with particular emphasis given to identifying the extreme lower and upper elements of the distribution (i.e., the outliers). In estimation, three mass points are specified for each attribute, the first and third of which are associated with lower and upper outlying parameter estimates respectively, whilst the middle mass point represents the parameter estimate not including the lower and upper outliers. Additionally, we allow for further random heterogeneity within these three segments, essentially leading to a mixture of distributions.
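The three-mass-point idea can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' estimation code: it assumes a single attribute, a multinomial logit kernel, and fixed (not estimated) mass points and weights, and it omits the additional random heterogeneity within segments. The function names and example values are hypothetical.

```python
import math

def logit_probs(utilities):
    """Multinomial logit probabilities for one choice task."""
    m = max(utilities)  # subtract the maximum for numerical stability
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def sequence_prob(tasks, choices, beta):
    """Probability of one respondent's observed choices given coefficient beta.

    tasks:   list of choice tasks, each a list of attribute levels
             (one level per alternative; single attribute assumed)
    choices: index of the chosen alternative in each task
    """
    prob = 1.0
    for levels, chosen in zip(tasks, choices):
        probs = logit_probs([beta * x for x in levels])
        prob *= probs[chosen]
    return prob

def mixture_likelihood(tasks, choices, mass_points, weights):
    """Likelihood as a weighted sum over the mass points.

    mass_points: (lower outlier, middle, upper outlier) coefficients
    weights:     segment probabilities, assumed to sum to one
    """
    return sum(w * sequence_prob(tasks, choices, b)
               for b, w in zip(mass_points, weights))

# Hypothetical example: two tasks, two alternatives, one cost attribute.
tasks = [[1.0, 2.0], [3.0, 1.0]]
choices = [0, 1]
mass_points = [-3.0, -1.0, -0.1]   # illustrative values only
weights = [0.1, 0.8, 0.1]          # illustrative values only
print(mixture_likelihood(tasks, choices, mass_points, weights))
```

In estimation, the mass points and weights would be free parameters chosen to maximise the sum of the log of this likelihood over respondents.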
Using a discrete mixture approach runs the risk of confounding outliers with heterogeneity (e.g., if the true distribution has a high variance, the mass points representing the lower and upper limits of the distribution may be more extreme than the true limits). In our empirical application, however, we find that this is not the case and that the approach is relatively insensitive to the potential confounding.
In this paper we are specifically interested in explaining the factors that contribute to respondents having outlying preferences. Within our finite mixture models we therefore include, in a latent class style approach, a series of covariates that help us to better explain whether someone is more likely to fall into the outlier segments. Results from this analysis on various standard stated choice datasets reveal that accounting for (and explaining) very low and high taste intensities leads to significant improvements in model fit and performance and has important consequences for welfare estimation. Additionally, we find that accounting for outliers may indeed lead to significant reductions in the "residual" random heterogeneity. Our findings reinforce the importance of testing for the presence of outliers and suggest that assessing them should be a recommended course of action in practice.
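The covariate-driven allocation to segments can be sketched as follows. The abstract does not state the functional form, so this example assumes a multinomial logit class-membership model, as is common in latent class analysis, with the middle segment's coefficients normalised to zero. The function name, covariates and coefficient values are hypothetical.

```python
import math

def membership_probs(covariates, gammas):
    """Probability of belonging to each segment given respondent covariates.

    covariates: respondent characteristics, including a leading 1.0
                for the constant
    gammas:     one coefficient vector per segment (lower outlier,
                middle, upper outlier); the middle vector is assumed
                fixed at zero for identification
    """
    scores = [sum(g * z for g, z in zip(gamma, covariates))
              for gamma in gammas]
    m = max(scores)  # subtract the maximum for numerical stability
    expo = [math.exp(s - m) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]

# Hypothetical example: a constant plus one covariate.
covariates = [1.0, 2.0]
gammas = [
    [-2.0, 0.5],   # lower-outlier segment (illustrative values only)
    [0.0, 0.0],    # middle segment, normalised to zero
    [-2.0, 0.0],   # upper-outlier segment (illustrative values only)
]
print(membership_probs(covariates, gammas))
```

Under these assumptions the returned probabilities replace the constant segment weights in the mixture likelihood, so each respondent's chance of falling into an outlier segment depends on their observed characteristics.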
Association for European Transport