Validation of Stated Preference Forecasting: a Case Study Involving Anglo-Continental Freight
FOWKES T and TWEDDLE G, University of Leeds
It is notoriously difficult to 'validate' Stated Preference procedures against real world choices, particularly because so much is liable to change, including the preferences themselves, in the period between the Stated Preference survey being undertaken and the implementation of the scheme under review. However, if Stated Preference experiments could say nothing worthwhile in such circumstances, then there would be little point in conducting them. There has, consequently, recently been much clamour for SP validation studies. Some commentators (e.g. Daly, 1996) have come close to suggesting that SP should no longer be used for forecasting on its own. This paper will attempt to show that useful SP forecasts are possible. In particular, a case study example relating to Anglo-Continental freight mode choice, Before and After the opening of the Channel Tunnel, will be presented.
Where both Revealed and Stated Preference data have been collected in a single interview, there has been good agreement between their estimates of attribute valuations (see, for example, Wardman, 1988). This is very strong evidence in support of the use of Stated Preference methods where RP methods are too expensive or are otherwise ruled out. In freight studies, data on actual choices are usually commercially very sensitive. In addition, RP cannot deal with modes that do not yet exist, as in our case study Before Survey, when the Channel Tunnel was not yet open. When it comes to forecasting, unadjusted use of SP and RP results from data collected at the same time does give different predictions. Analysis has shown that this is overwhelmingly due to differences in estimated scale factors. In logit models, random variability in choices, for example due to day to day whim or due to non-modelled attributes, causes all the estimated attribute parameters to be scaled by an unknown amount. The greater the proportion of measured utility variation that is due to the random error, as opposed to the modelled attributes, the smaller are the estimated parameters. When valuing attributes, such as scheduled journey time or delay, in money terms, we divide the appropriate parameter estimate by the parameter estimate for cost. In that way, the scale factors exactly cancel out and so there is no problem when deriving valuations. To forecast the probability of choosing a particular alternative, for example a newly introduced mode, we need the parameter estimates themselves, and so there is a problem.
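The point above can be illustrated with a small numerical sketch. The parameter values, scale factor, and mode attributes below are hypothetical, chosen only to show that the ratio of two parameters (a money valuation) is unaffected by a common scale, while forecast choice probabilities are not:

```python
import math

def logit_shares(utilities):
    """Multinomial logit choice probabilities from deterministic utilities."""
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical attribute parameters (cost, time) at full scale, and the same
# preferences as they might be recovered from SP data at a smaller scale.
beta_cost, beta_time = -0.10, -0.50
scale = 0.4  # assumed SP scale factor < 1

# The value of time (time parameter / cost parameter) is scale-free...
vot_full = beta_time / beta_cost                    # 5.0
vot_sp = (scale * beta_time) / (scale * beta_cost)  # 5.0 -- identical

# ...but forecast shares are not. Two modes with (cost, time) attributes:
modes = [(20.0, 1.0), (12.0, 3.0)]
u_full = [beta_cost * c + beta_time * t for c, t in modes]
u_sp = [scale * u for u in u_full]
shares_full = logit_shares(u_full)  # first mode's share ~0.55
shares_sp = logit_shares(u_sp)      # pulled toward 0.5: ~0.52
```

The valuation survives the unknown scale exactly, but the SP-scaled forecast is visibly compressed towards equal shares.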
It is usually held that respondents will have greater difficulty in making their SP responses than they will have in reaching real life decisions. The effect of this would be to depress the size of the estimated SP parameters. This in turn makes the modelled parameters less influential at the forecasting stage, with random error being given too prominent a role. At the extreme, if the parameter estimates became vanishingly small, we would only have 'error' and no information at all. In that circumstance, we would have no alternative but to allocate equal probability to each mode, i.e. if there were four modes we would predict shares of 0.25 regardless of their attributes. The general effect of overestimating the error term, as has often been alleged for SP data, is to artificially raise the predicted shares for modes with below average shares and lower the predicted shares for modes with above average shares. In other words, predicted modal shares are all pulled towards equal shares.
The above problem can be circumvented in several ways, but here we shall consider allocating each respondent to that mode modelled to have highest utility (or, equivalently, least generalised cost). In our case study we are in an especially advantageous position, since we have separate models for each respondent. Since the scaling is the same in the utility function for each mode, the predicted utilities for each mode on the basis of the modelled attributes will be (on the usual assumption) underestimated by exactly the same amount. Hence, if the Ferry utility is underestimated by 5%, then the 'Le Shuttle' utility will also be underestimated by 5%, for that respondent. Consequently, if we work through respondent by respondent, looking for the alternative with the highest predicted utility, then we need not bother about the scale factor effect, and should reach conclusions in line with actual choice behaviour. We will follow that approach.
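The allocation rule just described depends only on the ranking of utilities within each respondent, and any common positive scale factor leaves that ranking unchanged. The sketch below uses invented per-respondent utilities and scale factors to show the invariance:

```python
# Hypothetical per-respondent utilities for four modes (e.g. Ferry,
# 'Le Shuttle', and two others); each respondent's utilities are subject
# to an unknown common positive scale factor.
respondents = [
    {"scale": 0.3, "utilities": [-2.1, -1.8, -3.0, -2.5]},
    {"scale": 0.7, "utilities": [-1.2, -1.5, -0.9, -2.0]},
]

def allocate(utilities):
    """Assign the respondent to the mode with the highest predicted utility."""
    return max(range(len(utilities)), key=lambda i: utilities[i])

# Because the unknown factor multiplies every mode's utility equally,
# the argmax -- and hence the allocation -- is unaffected by it.
alloc_unscaled = [allocate(r["utilities"]) for r in respondents]
alloc_scaled = [allocate([r["scale"] * u for u in r["utilities"]])
                for r in respondents]
```

Working respondent by respondent in this way sidesteps the scale factor entirely, which is why separate models per respondent are such an advantage here.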
Association for European Transport