OD Data Fusion


G Skrobanski, Highways Agency, UK; M Logie, Minnerva, UK; I Black, J Fearon, C Gilliam, Hyder Consulting, UK


This paper reports on work undertaken for the UK Highways Agency to develop improved methods for estimating OD trip matrices by fusing matrix data with flow counts and potentially data from other sources, including ANPR and GPS-based instruments.


Methods for estimating origin-destination (OD) trip matrices from flow counts on transport network links have been an established part of transport modelling since the 1980s and are widely applied. This paper reports on work undertaken for the UK Highways Agency by Hyder Consulting and its project partners to improve the estimation procedures. The aim of the work is to maximise the exploitation of data that are becoming increasingly available from instrumented highways, as well as to ensure that OD matrices are estimated in the most appropriate way. The study has focused on the 'fusion' of count data with matrix data, but the methodology is designed to support a wider range of data sources, including data from ANPR and GPS-based instruments.

The project, known as 'OD Data Fusion', has been conducted in two parts. The first considered the methodological issues. The second investigated how best to address the issues highlighted in the first part, through statistical analysis and modelling and through application to two recent Highways Agency network modelling projects, one 'small' and one 'large' in scale.

While some theoretical alternatives exist, current matrix estimation methods may be seen as variations on a common theme, which uses an objective function to guide the adjustment of a prior trip matrix to match information on input observed flows. The translation from flows to matrix cell values involves information on vehicle (or passenger) routeings, which is taken from network modelling sources.

The first part of the study demonstrated that the choice of objective function, such as Maximum Likelihood, Maximum Entropy, or Generalised Least Squares, while material, was less significant than the weightings accorded to different sets of data to reflect their reliability as indicators of the trip matrix for the required situation (e.g. date and time of day). Some, though not all, matrix estimation procedures in current use employ weightings. Where weightings are used, it is left to users to provide the settings, which are typically determined via pragmatic assessments of data quality relative to some benchmark data identified by the user.
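To illustrate why the weightings can matter so much, the following is a minimal sketch that assumes a Generalised Least Squares objective with independent (diagonal) variances for the prior matrix cells and for the counts. The function name `gls_fuse`, the three-cell matrix, the routeing proportions and all variance values are hypothetical and are not taken from the study. The sketch also does not enforce non-negative cell values, which a practical procedure would need to do.

```python
import numpy as np

def gls_fuse(prior, prior_var, counts, count_var, routeing):
    """Minimise (t - prior)' V^-1 (t - prior) + (counts - P t)' W^-1 (counts - P t).

    prior      : prior OD cell values, shape (n_cells,)
    prior_var  : assumed variance of each prior cell (diagonal of V)
    counts     : observed link flows, shape (n_links,)
    count_var  : assumed variance of each count (diagonal of W)
    routeing   : P, proportion of each cell's trips using each link
    Note: non-negativity of the estimate is NOT enforced in this sketch.
    """
    v_inv = np.diag(1.0 / np.asarray(prior_var, dtype=float))
    w_inv = np.diag(1.0 / np.asarray(count_var, dtype=float))
    # Normal equations of the quadratic objective above.
    lhs = v_inv + routeing.T @ w_inv @ routeing
    rhs = v_inv @ prior + routeing.T @ w_inv @ counts
    return np.linalg.solve(lhs, rhs)

# Hypothetical example: three OD cells observed on two counted links.
prior = np.array([100.0, 50.0, 80.0])
routeing = np.array([[1.0, 1.0, 0.0],
                     [0.0, 1.0, 1.0]])
counts = np.array([180.0, 150.0])

# Same data, same objective function, opposite weightings.
trust_counts = gls_fuse(prior, np.full(3, 400.0), counts, np.full(2, 1.0), routeing)
trust_prior = gls_fuse(prior, np.full(3, 1.0), counts, np.full(2, 400.0), routeing)

print("counts trusted:", trust_counts, "modelled flows:", routeing @ trust_counts)
print("prior trusted: ", trust_prior, "modelled flows:", routeing @ trust_prior)
```

With the counts weighted heavily, the modelled flows are pulled close to the observed counts; with the prior weighted heavily, the estimate stays close to the prior matrix. The objective function is unchanged between the two runs, so the difference comes entirely from the weightings.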

The advent of copious data from instrumented sources provides the basis for examining issues of data reliability in a more systematic way. The project established a framework for examining different assumptions about data variability and the comparative improvements provided by additional data.

The variability of data was examined to establish their actual probability distributions, and Bayesian Markov chain Monte Carlo simulation modelling was used to compare their effect with the usual assumption that data variability is adequately described by Poisson or Normal distributions.
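A much simpler check than the Bayesian simulation described above, shown here only as a sketch, is the variance-to-mean ratio of repeated counts at a site: under a Poisson assumption this ratio should be close to 1. The data below are synthetic and generated purely for illustration, and the function name `dispersion_index` is hypothetical; none of this reproduces the study's analysis.

```python
import numpy as np

def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 for Poisson data."""
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()

rng = np.random.default_rng(0)

# Synthetic repeated counts with mean 500 (illustrative values only).
poisson_like = rng.poisson(lam=500, size=2000)
# Negative binomial with the same mean but a theoretical variance of 5500,
# standing in for day-to-day variability larger than Poisson would imply.
overdispersed = rng.negative_binomial(n=50, p=1 / 11, size=2000)

print("Poisson-like dispersion index:", dispersion_index(poisson_like))
print("Overdispersed dispersion index:", dispersion_index(overdispersed))
```

A ratio well above 1 would suggest that treating the counts as Poisson understates their variability, and therefore overstates the weight they should carry in the estimation.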

Much conventional use of matrix estimation involves estimating matrices from sets of singly-observed manual classified counts (MCC), typically supported by limited amounts of multi-observed automatic traffic count data. The results of the study indicate the extent to which such use of MCC data can be, or needs to be, improved relative to approaches that are statistically more secure.


Association for European Transport