Estimation of Random Coefficient Models on GPS-data


O A Nielsen, CTT, Technical University of Denmark, DK




The paper presents a method to estimate utility functions in route choice models from GPS-data. In contrast to the prevailing use of Stated Preference (SP) and other interview techniques to estimate choice models, the work in this paper derives the utility functions from observed trips and routes. This leads to several methodological challenges, especially since the explanatory variables are highly correlated.

The methodological discussions are tested on data from the AKTA road-pricing experiment in Copenhagen. A total of 500 cars were equipped with GPS over three experiment rounds in a two-year period. The normal travel pattern of each participant was estimated from observations in a control period of 10 to 12 weeks. A pricing scheme was then implemented over 12 to 14 weeks. The payment that a given road-pricing scheme would have implied, assuming no change in behaviour, could then be calculated. The main idea of the experiment was that the participants were paid the savings they achieved by changing behaviour compared to the control period. The behavioural impact of road pricing can thereby be revealed.

The GPS-data can be used to evaluate pricing schemes, changes in travel behaviour (time of day, choice of route, destination, number of trips, etc.), driving pattern (route, speed, etc.) and the state of the road system (speed and congestion). As each car was followed by GPS, the exact routes could be tracked by a map-matching algorithm, and the driving speed and congestion levels along the route could be calculated. The present paper focuses on route choices only.


The processing of GPS-data itself raises several methodological issues, e.g. coping with periods of lost signal, uncertainty in the received coordinates, and problems in matching the data to a digital map. However, with a suitable set of algorithms the data can be processed to deliver observed routes related to a digital map. Given this, the utility functions are estimated in the following way:

For each pair of trip start and end points, different combinations of coefficients in the utility function of a route choice model are applied. For each combination, the fit to the observed route can then be calculated (from 0 to 100%). If the best fit is clear, it is assigned a high weight. However, where several utility functions provide the same fit, each of them is assigned a lower weight, or the interval of best fit is kept (e.g. if a route is both the shortest and the fastest, then any combination of coefficients on time and length will provide the same fit).
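The fit-and-weight step can be illustrated with a minimal sketch. It is not the paper's implementation: the toy network, the definition of fit as the share of observed route length reproduced by the all-or-nothing path, and the equal split of a unit weight among tied candidates are all assumptions made for illustration, as are the names `LINKS`, `best_path`, `fit` and `weights`.

```python
import heapq

# Hypothetical toy network: link -> (time in minutes, length in km).
LINKS = {
    ("A", "B"): (4.0, 5.0),
    ("B", "D"): (4.0, 5.0),
    ("A", "C"): (6.0, 3.0),
    ("C", "D"): (6.0, 3.0),
}

def best_path(links, origin, dest, b_time, b_len):
    """All-or-nothing path: Dijkstra on the cost b_time*time + b_len*length."""
    graph = {}
    for (u, v), (t, d) in links.items():
        graph.setdefault(u, []).append((v, b_time * t + b_len * d))
    queue = [(0.0, origin, [origin])]
    done = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dest:
            return path
        if node in done:
            continue
        done.add(node)
        for nxt, c in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(queue, (cost + c, nxt, path + [nxt]))
    return None

def fit(links, predicted, observed):
    """Assumed fit measure: share of observed route length that is reproduced."""
    pred_links = set(zip(predicted, predicted[1:]))
    obs_links = list(zip(observed, observed[1:]))
    total = sum(links[l][1] for l in obs_links)
    shared = sum(links[l][1] for l in obs_links if l in pred_links)
    return shared / total

def weights(links, observed, candidates):
    """Fit per candidate; tied best candidates share a unit weight equally."""
    fits = {}
    for b_time, b_len in candidates:
        path = best_path(links, observed[0], observed[-1], b_time, b_len)
        fits[(b_time, b_len)] = fit(links, path, observed)
    best = max(fits.values())
    winners = [c for c, f in fits.items() if f == best]
    return fits, {c: 1.0 / len(winners) for c in winners}

# Observed route is the shorter but slower one (via C).
fits, w = weights(LINKS, ["A", "C", "D"], [(1.0, 0.0), (0.0, 1.0), (1.0, 2.0)])
```

In this toy case the time-only candidate predicts the route via B and obtains a fit of 0, while the two candidates that put enough weight on length both reproduce the observed route and therefore share the weight.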

The choice situation can be explained by the relative weighting of the different attributes. Scaling the utility function therefore reduces the number of possible coefficient combinations. In addition, different attributes have logical relationships (e.g. it is unlikely that any person would prefer congestion to non-congestion). An approach to reduce the number of necessary all-or-nothing path calculations is derived. This is the core of developing an applicable choice set generation method.
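A small sketch shows why scaling and logical constraints shrink the search. It assumes a deterministic all-or-nothing path depends only on the ratios between coefficients, so every combination can be divided by the free flow time coefficient, and it assumes the constraint that congested time is weighted at least as heavily as free flow time. The grid values and the function name `reduced_candidates` are hypothetical; the paper's own reduction approach is not reproduced here.

```python
from fractions import Fraction
from itertools import product

def reduced_candidates(grid):
    """Return the distinct (b_cong/b_free, b_len/b_free) pairs worth testing.

    grid: positive integer values tried for each of the three coefficients.
    """
    seen = set()
    for b_free, b_cong, b_len in product(grid, repeat=3):
        # Logical constraint (assumed): nobody prefers congestion to free flow.
        if b_cong < b_free:
            continue
        # Scaling: only ratios matter, so normalise by the free flow coefficient.
        seen.add((Fraction(b_cong, b_free), Fraction(b_len, b_free)))
    return sorted(seen)

candidates = reduced_candidates([1, 2, 4])
```

With the grid 1, 2, 4 the 27 raw combinations collapse to 12 distinct normalised candidates, each of which needs only one path calculation.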

For routes where no clear fit can be obtained, stochastic variation (a traditional probit-type error term) is added to the utility function to see if the fit can be improved.
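As an illustration of the effect of such an error term, the sketch below adds independent normal errors to the costs of two competing routes and counts how often the observed route becomes the cheaper one. This is a simplification: a full probit route choice model would place the errors on links, so that overlapping routes are correlated. The function name, the route costs and the value of `sigma` are hypothetical.

```python
import random

def share_chosen(cost_observed, cost_alternative, sigma, n_draws=2000, seed=1):
    """Share of draws in which the observed route has the lower perceived cost."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        # Normally distributed (probit-type) error on each route cost.
        perceived_obs = cost_observed + rng.gauss(0.0, sigma)
        perceived_alt = cost_alternative + rng.gauss(0.0, sigma)
        if perceived_obs < perceived_alt:
            hits += 1
    return hits / n_draws

deterministic = share_chosen(11.0, 10.0, sigma=0.0)
stochastic = share_chosen(11.0, 10.0, sigma=2.0)
```

Without the error term the observed route is never predicted, since its cost is higher. With `sigma=2.0` it is chosen in roughly 0.36 of the draws in expectation, which indicates how large an error term is needed to explain the observation.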

This procedure is run for all trips of each person to investigate the internal consistency of the preferences (utility function). From the set of estimated coefficients, the distribution of the coefficients can easily be examined. This includes the functional form (shape of the distribution) as well as possible correlation between different coefficients (e.g. the coefficients on free flow time and congestion time compared to length and road pricing).
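A minimal sketch of such a summary for one person is given below. The per-trip coefficient values are made-up illustration numbers, not results from the experiment, and the helper `pearson` is written out by hand so that the block depends only on the standard library.

```python
import math
import statistics

# Hypothetical per-trip best-fit coefficients for one person (time normalised to 1).
b_congestion = [1.2, 1.5, 1.1, 1.8]
b_length = [0.4, 0.6, 0.3, 0.8]

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of coefficients."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

summary = {
    "mean_congestion": statistics.fmean(b_congestion),
    "sd_congestion": statistics.stdev(b_congestion),
    "corr_congestion_length": pearson(b_congestion, b_length),
}
```

A histogram of the per-trip values would indicate the shape of the distribution, and the same summary computed per person supports the comparison between persons.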

The best overall deterministic utility function can be estimated by a similar approach, in which the fit of each possible combination of coefficients is evaluated against the entire set of routes for each person.
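The selection can be sketched as follows, assuming that the fits per trip have already been computed and that the overall criterion is the unweighted mean fit across a person's trips. The criterion, the fit values and the name `best_overall` are assumptions for illustration.

```python
from statistics import fmean

# Hypothetical fits for one person: candidate (b_time, b_len) -> fit per trip.
FITS = {
    (1.0, 0.0): [1.0, 0.4, 0.0],
    (1.0, 0.5): [1.0, 0.8, 0.7],
    (0.0, 1.0): [0.2, 0.8, 1.0],
}

def best_overall(fits_by_candidate):
    """Pick the candidate with the highest mean fit over all trips."""
    means = {c: fmean(f) for c, f in fits_by_candidate.items()}
    best = max(means, key=means.get)
    return best, means

best, means = best_overall(FITS)
```

Here the compromise candidate `(1.0, 0.5)` wins although it is never the unique best on a single trip, which is the difference between the per-trip and the overall estimate.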

Finally, variation between persons can be investigated and estimated by comparing the results for all participants.


The method was used on the AKTA GPS-data.
The map matching was done on a detailed digital map, which included 350,000 links.
The links were classified into 25 classes (from motorways to driveways and gravel roads) and carried variables for length, free flow travel time, congested travel time in different time intervals during the day, and road pricing (for those parts of the experiment in which the participants faced road pricing). The utility functions included the same variables.

Each person made between 250 and 1,000 trips during the experiment (depending on their travel pattern), and 500 persons participated. They were selected according to a factorial design based on income and commuting pattern (between different areas of Copenhagen).

The paper presents the estimated utility functions and discusses the intra- and inter-person variation of the coefficients as well as the size of the error term.

The participants took part in an SP experiment before the main driving experiment. Since the SP experiment investigated the same choice situation as the AKTA experiment, it was possible to estimate a logit model based on the SP data (reported in a prior paper by Nielsen & Jovicic). This estimation is extended to a mixed logit model estimated by MSL (Maximum Simulated Likelihood) with normally or log-normally distributed coefficients (with or without correlation). The results of this prior estimation are compared with the results from the GPS-data to see whether the two approaches provide the same or different results.
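The simulated likelihood behind MSL can be sketched for the simplest case: a binary choice with a single normally distributed coefficient and repeated choices per person. The data layout, the function names and the parameter values are assumptions for illustration, and a real estimation would maximise this function over `mean` and `sd` with a numerical optimiser.

```python
import math
import random

def logit_prob(beta, dx, chosen):
    """Binary logit probability; dx is the attribute difference (alt 1 minus alt 0)."""
    p1 = 1.0 / (1.0 + math.exp(-beta * dx))
    return p1 if chosen == 1 else 1.0 - p1

def simulated_loglik(persons, mean, sd, n_draws=200, seed=0):
    """Simulated log-likelihood with beta ~ Normal(mean, sd) drawn per person."""
    rng = random.Random(seed)  # fixed seed keeps draws identical across calls
    total = 0.0
    for choices in persons:
        sim = 0.0
        for _ in range(n_draws):
            beta = mean + sd * rng.gauss(0.0, 1.0)
            # One draw applies to all choices of the person (panel structure).
            lik = 1.0
            for dx, chosen in choices:
                lik *= logit_prob(beta, dx, chosen)
            sim += lik
        total += math.log(sim / n_draws)
    return total

# Hypothetical SP answers: (time difference in minutes, chosen alternative).
persons = [[(-5.0, 1), (3.0, 0)], [(-2.0, 1)]]
ll_negative = simulated_loglik(persons, mean=-0.5, sd=0.3)
ll_positive = simulated_loglik(persons, mean=0.5, sd=0.3)
```

With `sd=0` the expression reduces to the ordinary logit log-likelihood. For a log-normally distributed coefficient, the draw would be replaced by `math.exp(mean + sd * z)`, with the sign of the attribute reversed where the coefficient is expected to be negative.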


Association for European Transport