Merging Traditional and Big Data Sources to Create a Practical OD Matrix



Nominated for The Neil Mansfield Award

Authors

Tim Pollard, Mott MacDonald

Description

For many years, building OD matrices in the UK has focussed on data from surveys.
This paper considers the value of a larger percentage of OD cells filled by Big Data over the very small percentage available from traditional surveys alone.

Abstract

For many years, building demand matrices in the UK for strategic transport models has focussed on data from intercept surveys, e.g. Road Side Interview (RSI) data or Car Park Survey data, with synthetic data used to infill gaps.
This presents a number of problems. Firstly, as a rule of thumb, matrices sourced in this way cost approximately 10 euros per trip record and capture around 10% of the total passing traffic, so expansion factors of 10 are not uncommon. Secondly, there is a general concern about bias from the survey techniques (e.g. RSI bias towards longer-distance journeys).
In a recent model development project, €600k of conventional data filled only around 2% of OD pairs. The majority of cells in the final matrix are therefore filled from unobserved (i.e. synthetic) data, leaving the modeller reliant on comparisons of link flows with counts to ensure matrix suitability. As shown in Pollard et al. (2013), these link-flow comparisons are often poor at differentiating a good matrix from a bad one. This can lead to large margins of error in the forecasts produced: Daly et al. (2011) showed that a well-filled base-year observed matrix is essential in pivot-point models.
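The rule-of-thumb figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and applying the 10 euro per-record cost to the €600k project is an assumption made here for illustration, not something the abstract states.

```python
def expansion_factor(sample_rate: float) -> float:
    """Factor by which each surveyed trip is scaled up to represent all passing traffic."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in the interval (0, 1]")
    return 1.0 / sample_rate


def records_for_budget(budget_eur: float, cost_per_record_eur: float) -> int:
    """Number of whole trip records a survey budget buys at a given unit cost."""
    if cost_per_record_eur <= 0:
        raise ValueError("cost_per_record_eur must be positive")
    return int(budget_eur // cost_per_record_eur)


# A survey capturing 10% of passing traffic implies an expansion factor of 10.
factor = expansion_factor(0.10)

# ASSUMPTION: the rule-of-thumb unit cost applies to the EUR 600k project.
records = records_for_budget(600_000, 10)
```

Under that assumption the budget corresponds to roughly 60,000 trip records, each standing in for about ten real trips once expanded.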

Along with the questionable value for money discussed above, an interesting paper by Potter et al. (2011) also raises concerns about the dependence on roadside interviews in modern transport models.
New sources of observed data are now becoming available at much lower cost (for example, Traffic Master GPS data is free to UK local authorities for matrix building, and AirSage in the USA sells trip matrices based on cell phone data), but these often come with much less information per record than a traditional survey, and with some concerns about bias. What place do these emerging sources have alongside traditional ones?

This paper reviews the methods deployed to combine these new Big Data sources with traditional survey data in a recent development of the PRISM strategic transport model of the West Midlands, UK. It looks at what data-cleaning and bias-correction techniques were required for the Big Data, and how the data were merged whilst preserving those trips from direct observations.
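One simple way to picture a merge that preserves directly observed trips is a precedence rule: where a survey observation exists for an OD pair it is kept, and Big Data fills only the remaining cells. The sketch below is a minimal illustration of that idea, not the method used in PRISM. The function names, zone labels and trip values are all hypothetical, and the data-cleaning and bias-correction steps are omitted.

```python
from typing import Dict, Iterable, Tuple

ODPair = Tuple[str, str]  # (origin zone, destination zone)


def merge_matrices(
    observed: Dict[ODPair, float], big_data: Dict[ODPair, float]
) -> Dict[ODPair, float]:
    """Fill cells from big_data, then overwrite with observed values where present."""
    merged = dict(big_data)   # start from the Big Data cells
    merged.update(observed)   # surveyed cells take precedence and are preserved
    return merged


def filled_share(matrix: Dict[ODPair, float], zones: Iterable[str]) -> float:
    """Share of all possible OD cells that hold a positive trip value."""
    n_zones = len(list(zones))
    filled = sum(1 for trips in matrix.values() if trips > 0)
    return filled / (n_zones * n_zones)


# Hypothetical three-zone example with made-up trip values.
zones = ["A", "B", "C"]
observed = {("A", "B"): 120.0}
big_data = {("A", "B"): 95.0, ("B", "C"): 40.0, ("C", "A"): 15.0}

merged = merge_matrices(observed, big_data)
share_before = filled_share(observed, zones)
share_after = filled_share(merged, zones)
```

In this toy example the surveyed value for A to B survives the merge, while the share of filled cells rises from one in nine to three in nine.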

This paper considers the value of having a larger percentage of OD cells filled by Big Data over the very small percentage available from traditional surveys alone.

Publisher

Association for European Transport