New Data Sources and Data Fusion


Anan Allos (Atkins), Andrew Merrall (Atkins), Roger Himlin (Highways Agency)


Research into the efficacy of new data sources, such as mobile phone and GPS data, for enhancing prior matrices once they have been combined with conventional RSI data through a statistically robust data fusion process.


Origin-Destination (OD) information is key to building models, and the Roadside Interview (RSI) survey has been the method of choice for collecting this intercept data for highway ODs. However, RSIs are becoming increasingly difficult to undertake, in large part because of the congestion and disruption they cause. On most A roads and all motorways it is not possible to carry out RSIs, although these roads carry the more important movements.
Because of this, the focus has recently shifted to the use of Big Data (such as that from mobile phones, GPS and Bluetooth). Although these ‘passive’ data sources suffer from many shortcomings, they are proving valuable for the large samples, consistency and coverage they provide.
The primary objective of the research carried out for the Highways Agency (HA) is to assess the efficacy of GPS data (from TrafficMaster, TM) and mobile phone data (from INRIX) in enhancing a model’s ‘prior matrix’. The work started by examining a number of strategic models and selecting the most suitable against a number of criteria. TM and INRIX data were obtained, and the paper will detail the nuances, challenges, and pros and cons of each dataset. The datasets were segregated by mode and vehicle type, and were adjusted after comparison with independent data.
Each dataset was separately combined with partial data from RSIs through a novel data fusion technique. The fused matrices were then separately combined with the synthetic model matrices to form the prior matrices, in preparation for matrix estimation (ME). The new prior and post-ME matrices were compared with the post-ME matrix derived using conventional data alone, to assess whether the new data source reduced the need for ME and whether the final post-ME matrix represented an improvement in terms of validation against independent data. The better-performing data source was thereby identified.
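The abstract does not describe the fusion technique itself, so the following is only an illustrative sketch and not the authors' method. It assumes a simple inverse-variance weighting, a standard way of combining two independent estimates of the same quantity, applied cell by cell to a passive-data OD matrix and a partial RSI matrix. It also shows one possible way to gauge how much ME alters a prior matrix. All function names, matrices and variances are hypothetical.

```python
def fuse_inverse_variance(passive, rsi, var_passive, var_rsi):
    """Fuse two OD matrices cell by cell using inverse-variance weights.

    Assumption: RSI data is partial, so cells with no RSI observation are
    marked None and fall back to the passive-data estimate.
    """
    fused = []
    for i, row in enumerate(passive):
        fused_row = []
        for j, p in enumerate(row):
            r = rsi[i][j]
            if r is None:
                fused_row.append(p)
                continue
            w_p = 1.0 / var_passive[i][j]  # weight of the passive estimate
            w_r = 1.0 / var_rsi[i][j]      # weight of the RSI estimate
            fused_row.append((w_p * p + w_r * r) / (w_p + w_r))
        fused.append(fused_row)
    return fused


def relative_change(prior, post):
    """Total absolute cell change from prior to post-ME, as a share of prior trips."""
    total_prior = sum(sum(row) for row in prior)
    total_diff = sum(
        abs(b - a)
        for prior_row, post_row in zip(prior, post)
        for a, b in zip(prior_row, post_row)
    )
    return total_diff / total_prior


# Hypothetical 2x2 example: only the cell (0, 1) is observed by an RSI.
passive = [[0.0, 100.0], [80.0, 0.0]]
rsi = [[None, 120.0], [None, None]]
var_passive = [[1.0, 400.0], [400.0, 1.0]]
var_rsi = [[None, 100.0], [None, None]]

fused = fuse_inverse_variance(passive, rsi, var_passive, var_rsi)

# Hypothetical post-ME matrix, used only to illustrate the change measure.
post_me = [[0.0, 110.0], [70.0, 0.0]]
change = relative_change(passive, post_me)
```

In this example the RSI estimate has the smaller variance, so the fused value for cell (0, 1) is pulled towards it (about 116 trips). A smaller `relative_change` between a prior and its post-ME matrix would suggest that ME had less work to do on that prior.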
If the use of new data sources and the method of data fusion are found to be successful, the intention is that guidance will be written so that the technique can be adopted.


