Random Sampling of Alternatives in a Route Choice Context

Random Sampling of Alternatives in a Route Choice Context

WINNER OF The Neil Mansfield Award


E Frejinger, EPFL, CH


This paper presents random sampling approach for the definition of choice sets in a route choice context. Moreover, we present a systematic comparison of existing path enumeration algorithms using different networks.


Random utility models are the most widely used models for analysing and predicting route choice behaviour. In this context, a choice set for each origin-destination pair needs to be defined. The number of elementary paths connecting an origin to its destination can be very large. Assuming that a traveller chooses between all feasible paths is therefore not only behaviourally unrealistic, but can also cause operational problems in estimating and applying a route choice model. This motivates the need for using path enumeration algorithms for defining choice sets of limited size. Ideally, the choice set should contain all paths that a traveller might consider but no unreasonable paths. Moreover, only paths that are perceived as different by the travellers should be included. Indeed, in an urban network paths can have such a high degree of overlap that they may not be perceived as different by the travellers.

In this paper we propose a random sampling approach for defining choice sets. It is motivated by the fact that Multinomial Logit and GEV models can be consistently estimated with a subset of randomly selected alternatives (Ben-Akiva and Lerman, 1985, and Bierlaire et al., 2006). In addition to a description of the proposed approach, we provide a systematic comparison of existing path enumeration algorithms on different networks. In particular with two stochastic approaches that have been proposed in the literature by Bovy and Fiorenzo-Catalano (2006) and Ramming (2001). For this purpose we have developed performance measures that are dependent and independent of the observed path choices.

The approach proposed here is based on subpaths where a subpath is a sequence of links. The originality of this approach is that the inclusion of a subpath in a choice set is modelled in a stochastic way based on the distance to the shortest path. More precisely, the probability associated with a subpath is defined by the double bounded Kumaraswamy distribution and the probabilities assigned to the subpaths can be controlled by the definition of the distribution parameters. This is a very flexible approach that can be used in various path enumeration algorithms (including those described in the literature). In this paper, we propose two new approaches.

First, we propose a biased random walk algorithm. Starting from the origin, a subpath is selected using the probability distribution described above. Another subpath starting at the sink node of the first one is then generated in the same way, such that the combination of the two does not form a loop. This process is applied until the destination is reached, and a complete path has been generated. In the extreme case where the subpath is composed of one link, and the selection of the next link is based on a uniform distribution, we have a pure random walk. The algorithm biases the random walk towards the shortest path, in a way controlled by the parameters of the distribution.

Second, we propose a forced passage algorithm, where a subpath is selected anywhere in the network, using the probability distribution described above. We generate a path composed of three segments: the shortest path from the origin to the source node of the subpath, the subpath itself, and the shortest path from the sink node of the subpath to the destination.

Each of these approaches has advantages and drawbacks that will be discussed. The final algorithm which is currently under development will combine both approaches. The algorithms have been implemented in BioRoute which is a route choice modelling tool for BIOGEME (Bierlaire, 2005). They have up to date been tested on a simplified Swiss network containing 39411 links and 14841 nodes. This is to our knowledge the largest network used for testing choice set generation algorithms. The results are very promising for both the biased random walk and the forced passage approaches. For the Swiss dataset, 50% of the observations were found by the link-elimination algorithm, while approximately 80% were found by our algorithms.

This is an ongoing research project and we are currently performing further tests and comparisons with other algorithms. For these tests we are also using other datasets and networks. Namely, a GPS dataset collected in the Swedish city of Borlänge (Frejinger and Bierlaire, 2007) and the Boston dataset also used by Ramming (2001).


Association for European Transport