Further Developments in OD Data Fusion Methodologies


G Skrobanski, Highways Agency; M Logie, Minnerva; I Black, Independent Consultant; Y Dong, C Gilliam, Hyder Consulting (UK) Ltd.; J Fearon, John Fearon Consultancy, UK


The OD data fusion methods reported at ETC in 2010 have been further developed. Greater mathematical rigour has been provided and the methodology has been demonstrated on a large-scale application relating to the M25 London orbital motorway.


This paper reports on further stages in the development of OD data fusion methods reported at ETC in 2010. As previously, the work was undertaken for the UK Highways Agency by a team led by Hyder Consulting (UK) Ltd.

This further project gave the opportunity to take the prior work forward by providing more mathematical rigour and by demonstrating the methodology on a large-scale application, namely the M25 area of London.

In this context, "OD data fusion" can be understood as a form of trip matrix estimation using observed data, namely traffic count data, to update an existing, "prior" trip matrix.
Trip matrix estimation is widely used but its results are not always accepted as valid and some guidance suggests that it should only be used to a limited degree. At the same time, widespread instrumentation of the highway network means that very large volumes of traffic count data are routinely collected, and other forms of data on speeds and travel patterns are enabled by GPS and ANPR systems. The OD fusion project confined itself to count data, but remained mindful of these other forms so that its approach could reasonably be extended to fuse these other potential data sources.

The OD Data Fusion project sought to address these points in several ways, including:
- Exploring the qualities of different matrix estimation formulations encountered in the literature
- Providing a firm statistical formulation for a revised method ("OD fusion")
- Confirming the characteristics of the OD fusion method on conveniently sized datasets
- Confirming the practicality of the method with a large application relevant to the Highways Agency
- Understanding and modelling traffic count variability.

The matrix estimation/OD fusion problem is specified as an optimisation problem, which earlier stages of the OD fusion work solved using numerical simulation methods. This approach continued in this project, as the flexibility of simulation readily allows different formulations to be examined. However, the mathematical analysis also provided an analytic (Weighted Least Squares) formulation based on standard statistical foundations. An implementation of the analytic formulation was compared with an equivalent implementation using simulation, as well as with various other formulations (Maximum Likelihood, etc.).
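As an illustration only, a Weighted Least Squares fusion of a prior matrix with counts is commonly written in the following textbook form. The notation is an assumption made here and is not taken from the project: $t_0$ is the prior OD vector with covariance $V_0$, $c$ is the vector of observed counts with covariance $W$, and $A$ is a matrix giving the proportion of each OD flow that passes each count site.

$$
\hat{t} \;=\; \arg\min_{t}\; (t - t_0)^{\top} V_0^{-1} (t - t_0) \;+\; (c - A t)^{\top} W^{-1} (c - A t)
$$

Setting the gradient to zero gives a closed-form estimate and its covariance:

$$
\hat{t} \;=\; t_0 + V_0 A^{\top} \left( A V_0 A^{\top} + W \right)^{-1} (c - A t_0),
\qquad
\hat{V} \;=\; V_0 - V_0 A^{\top} \left( A V_0 A^{\top} + W \right)^{-1} A V_0 .
$$

Under this form, $\hat{V}$ can serve as the prior covariance in a later fusion step. Whether the project's formulation matches this form term by term is not stated in this abstract.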

The analytic formulation was shown to be more robust and, generally, to give better results (according to reasonable criteria) than the simulation methods. It provides direct estimates of the covariances of the fused OD values, which give a means of gauging accuracy, and it allows estimates to be fused with further data when these become available.
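The role of the covariances can be sketched with a small example. The code below is a minimal illustration, not the project's MATLAB implementation: it assumes Normal errors, independent count errors, and a known proportion of each OD flow passing each count site. The function name `fuse_count` and all numbers are hypothetical. Each call fuses one count and returns an updated estimate together with its covariance, so a later count can be fused in the same way.

```python
def fuse_count(t, V, a, count, count_var):
    """Fuse one traffic count into OD estimate t (list) with covariance V (list of lists).

    a[j] is the assumed proportion of OD pair j's trips passing the count site.
    count_var is the assumed variance of the count error.
    """
    n = len(t)
    # Covariance between each OD value and the modelled flow at the site (V a)
    Va = [sum(V[i][j] * a[j] for j in range(n)) for i in range(n)]
    # Variance of (observed count - modelled flow): a' V a + count variance
    s = sum(a[i] * Va[i] for i in range(n)) + count_var
    # Difference between the observed count and the flow implied by t
    resid = count - sum(a[j] * t[j] for j in range(n))
    # Weighted least squares update of the estimate and its covariance
    t_new = [t[i] + Va[i] * resid / s for i in range(n)]
    V_new = [[V[i][j] - Va[i] * Va[j] / s for j in range(n)] for i in range(n)]
    return t_new, V_new


# Hypothetical prior: three OD pairs, coefficient of variation 0.3, uncorrelated
prior = [100.0, 200.0, 50.0]
prior_cov = [[(0.3 * prior[i]) ** 2 if i == j else 0.0 for j in range(3)]
             for i in range(3)]

# First count: site used by OD pairs 0 and 1, observed 330, standard deviation 33
t1, V1 = fuse_count(prior, prior_cov, [1.0, 1.0, 0.0], 330.0, 33.0 ** 2)

# A further count arrives later: site used by OD pairs 1 and 2
t2, V2 = fuse_count(t1, V1, [0.0, 1.0, 1.0], 240.0, 24.0 ** 2)
```

The diagonal of `V2` gives the variance of each fused OD value, and the off-diagonal terms show how the counts have correlated the estimates. With independent count errors, fusing the counts one at a time in this way gives the same result as fusing them together.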

Additionally, the analytic method was faster to implement and, for the large M25 application, took around one tenth of the time required by the simulation method.

The project was designed as a research project, but the mathematical results can be implemented in a reasonably straightforward manner using suitable software, such as the MATLAB software used in this case.

Much of the project's time was spent collating and preparing data for the M25 area application, which raised practical issues as well as the "real data" issues of inconsistencies, errors, and omissions. The OD fusion method is concerned with statistical variability and does not directly deal with data biases and inconsistencies. The project therefore demonstrated several procedures for addressing these practical problems, which can otherwise undermine the method.

The favoured analytic formulation assumed that data variability could be described adequately using a Normal distribution function. This assumption was examined in a part of the project that analysed extensive automatic traffic count data from sites across the region surrounding and inside the M25. It was found that the nature of the distribution varied, so the Normal distribution did not always provide the best description of variability, but neither did the other distributions that were investigated. In this investigation the project applied various Bayesian modelling techniques to derive ways of characterising count data variability, which was summarised as Coefficients of Variation for different types of roads (motorways, A-roads), in urban and non-urban areas, and in summer and winter seasons.
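To show what a Coefficient of Variation summary looks like in practice, the sketch below computes a plain sample coefficient of variation (standard deviation divided by mean) for counts grouped by road type, area, and season. This is a simplification and not the Bayesian modelling used in the project. The function names, the record layout, and the counts are all hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(counts):
    """Sample standard deviation divided by the mean of the counts."""
    return stdev(counts) / mean(counts)


def cv_by_group(records):
    """Group (road_type, area, season, count) records and return a CV per group.

    Groups with fewer than two counts are skipped, since a sample
    standard deviation needs at least two values.
    """
    groups = {}
    for road_type, area, season, count in records:
        groups.setdefault((road_type, area, season), []).append(count)
    return {key: coefficient_of_variation(values)
            for key, values in groups.items() if len(values) >= 2}


def count_variance(cv, expected_count):
    """Turn a CV into a count variance, assuming std dev = CV * expected count."""
    return (cv * expected_count) ** 2


# Hypothetical daily counts, for illustration only
records = [
    ("motorway", "non-urban", "summer", 90000),
    ("motorway", "non-urban", "summer", 100000),
    ("motorway", "non-urban", "summer", 110000),
    ("A-road", "urban", "winter", 18000),
    ("A-road", "urban", "winter", 22000),
]
cvs = cv_by_group(records)
```

A coefficient of variation obtained this way could supply the count variance that a Weighted Least Squares formulation needs for a site of the matching type, here through the assumed helper `count_variance`.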


Association for European Transport