Overcoming Data Deficiencies Through Advanced Modelling Techniques


Eric Petersen, Andrew Daly, James Fox, Stephen Miller, RAND Europe, UK


This paper describes a modelling effort for Leicestershire where diverse datasets, each of which is missing important information, were combined in a rigorous fashion to produce reasonable model results.


This paper will discuss the RAND Europe experience in estimating the core travel demand models for the Generic Urban Models Phase 2 (GUM) project, sponsored by the UK Department for Transport. The aim of the study is to construct a model applicable to "large urban areas", i.e. non-metropolitan cities with populations over 250,000, which will include land-use, travel demand and assignment models. The model is initially based on Leicester, but it will then be generalised to the 12 cities of this type in Great Britain.

The primary difficulties in travel demand model estimation in the GUM project stem from deficiencies inherent in the pre-existing household travel survey in Leicester and Leicestershire. This survey was not designed for travel demand modelling; consequently, it covered only a single individual in each household and omitted the travel of anyone under 16 years old. The survey had reasonably detailed information on tour origins but no crisp destination information. Finally, it lacked information on the time of travel.

RAND Europe overcame these obstacles by bringing in additional available sources of data: an extensive roadside interview (RSI) study and a survey of public transport users (PT) in Leicester. The paper will discuss two major issues involved in bringing the three data sources together. First, the structure of the model is unusual. The household survey contributes information about mode choice and a "fuzzy" set of destinations, based on the respondent's self-reported travel distance. The RSI and PT data contribute a crisp destination choice and time of day for the tour legs, but limited socio-economic information. The resulting model structure places mode choice at the top of the hierarchy, with a subset of chosen destinations immediately below, followed by individual destinations and finally time of day alternatives. The second major issue in amalgamating the data sources is ensuring that proper weighting and scaling are applied to bring the three data sources onto a common metric, accounting for survey biases, particularly in the RSI and PT data.
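
The hierarchy described above can be illustrated with a minimal sketch of a tree (nested) logit calculation. This is not the estimated GUM model. The non-normalised logsum form, the structural parameters (theta), the distance bands, the zone names and all utilities are assumptions chosen purely for illustration.

```python
import math


def composite_utility(node):
    """Utility of a node: its own utility for an elemental alternative,
    otherwise theta times the logsum over its children (non-normalised form)."""
    if "children" not in node:
        return node["utility"]
    logsum = math.log(sum(math.exp(composite_utility(child))
                          for child in node["children"].values()))
    return node.get("utility", 0.0) + node.get("theta", 1.0) * logsum


def leaf_probabilities(node, prob=1.0, path=()):
    """Probability of each elemental alternative, built up as the product
    of conditional logit probabilities down the tree."""
    if "children" not in node:
        return {path: prob}
    utils = {name: composite_utility(child)
             for name, child in node["children"].items()}
    denom = sum(math.exp(u) for u in utils.values())
    result = {}
    for name, child in node["children"].items():
        conditional = math.exp(utils[name]) / denom
        result.update(leaf_probabilities(child, prob * conditional, path + (name,)))
    return result


def periods(v_peak, v_offpeak):
    """Lowest level: time-of-day alternatives for one destination zone."""
    return {"theta": 0.8, "children": {"peak": {"utility": v_peak},
                                       "off-peak": {"utility": v_offpeak}}}


# Hypothetical tree: mode > distance band (destination subset) > zone > period.
TREE = {"children": {
    "car": {"theta": 0.7, "children": {
        "0-5 km": {"theta": 0.9, "children": {"zone A": periods(-0.5, -0.3),
                                               "zone B": periods(-0.8, -0.6)}},
        "5-15 km": {"theta": 0.9, "children": {"zone C": periods(-1.4, -1.1)}},
    }},
    "bus": {"theta": 0.7, "children": {
        "0-5 km": {"theta": 0.9, "children": {"zone A": periods(-1.0, -0.9),
                                               "zone B": periods(-1.2, -1.0)}},
    }},
}}

PROBS = leaf_probabilities(TREE)
for alternative, p in sorted(PROBS.items()):
    print(" > ".join(alternative), round(p, 4))
```

In a structure of this kind, household survey records would inform the mode and distance-band levels, while RSI and PT records would inform the zone and time-of-day levels. When every theta equals 1, the tree collapses to a simple multinomial logit over the elemental alternatives.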

The positive outcome of this modelling effort is that RAND Europe has demonstrated that diverse datasets can be combined to produce reasonable model results, provided that a rigorous approach to combining these data sources is implemented. The methods used are expected to be applicable in other contexts of limited data availability.


Association for European Transport