Car Demand Forecasting Using Pseudo Panel Method


B Huang, MVA Ltd., UK


This paper presents a new approach to car demand forecasting. A pseudo-panel dataset was constructed from the UK Family Expenditure Survey, and several non-linear forecasting models, together with the saturation level of car ownership, were estimated.


A review of the international literature reveals that static approaches dominate car ownership forecasting. Including dynamics in car demand forecasting is expected to yield fruitful results. Nevertheless, because of the heavy data requirements, relatively few forecasting models use a dynamic approach, apart from some based on aggregate time series methods. It is possible to forecast car demand using a panel data model. However, there is only one panel survey in Britain, the British Household Panel Survey (BHPS), and it contains limited transport-related information, which makes it inadequate for a forecasting model. Furthermore, because of attrition, the size and representativeness of the sample decline over time, rendering the panel data inferior to other national cross-sectional data.

One way to circumvent the need for panel data is to construct pseudo panels from cross-sectional data. The pseudo-panel approach is a relatively new econometric approach to estimating dynamic demand models. A pseudo panel is an artificial panel based on (cohort) averages of repeated cross-sections. Extra restrictions must be imposed on pseudo-panel data before they can be treated as actual panel data. In defining the cohorts, one should aim for homogeneity within cohorts and heterogeneity between cohorts. In this way, it is possible to overcome the deficiencies of both static models and aggregate time series. In recent years, there have been several studies of pseudo-panel models of car ownership (e.g. Dargay and Vythoulkas, 1999). However, these are usually analytical models, which might not be suitable for forecasting purposes.

In the current study, a pseudo-panel dataset is constructed using the Family Expenditure Survey data in the UK. The cohorts are defined on the basis of one shared characteristic that is time-invariant: the year of birth of the head of the household. Birth cohorts are defined in five-year bands. For example, all households whose head was born between 1901 and 1905 are grouped into one cohort for each sampling year. Likewise, all households whose head was born between 1906 and 1910 form a cohort, as do those whose head was born between 1911 and 1915, and so on. In this way, it is possible to track notionally the "same" group of people over time. In total, the constructed pseudo panel has 254 observations, covering 19 years from 1982 to 2000.
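The cohort construction described above can be sketched as follows. This is a minimal illustration only, not the procedure used in the study: the record keys (`survey_year`, `head_birth_year`, `cars`), the function names, and the sample records are all hypothetical, and the only detail taken from the text is the five-year birth band starting at 1901.

```python
from collections import defaultdict


def birth_cohort(birth_year, base_year=1901, band=5):
    """Return the (start, end) birth-year band that contains birth_year."""
    start = base_year + ((birth_year - base_year) // band) * band
    return (start, start + band - 1)


def build_pseudo_panel(households):
    """Average cars per household within each (cohort, survey year) cell.

    households: iterable of dicts with the hypothetical keys
    'survey_year', 'head_birth_year' and 'cars'.
    """
    cells = defaultdict(list)
    for h in households:
        key = (birth_cohort(h["head_birth_year"]), h["survey_year"])
        cells[key].append(h["cars"])
    # One pseudo-panel observation per cell: the cohort mean.
    return {key: sum(cars) / len(cars) for key, cars in cells.items()}


# Made-up records for illustration only; these are not survey data.
sample = [
    {"survey_year": 1982, "head_birth_year": 1903, "cars": 0},
    {"survey_year": 1982, "head_birth_year": 1905, "cars": 1},
    {"survey_year": 1982, "head_birth_year": 1908, "cars": 1},
    {"survey_year": 1983, "head_birth_year": 1902, "cars": 1},
]
panel = build_pseudo_panel(sample)
```

Each key of `panel` identifies one cohort in one survey year, so the same birth band can be followed across successive cross-sections even though the sampled households differ.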

One common problem in pseudo-panel model estimation is that the intercept term varies across cohorts, since the individuals making up each cohort are not the same in every year. This problem can be circumvented by using the errors-in-variables estimator of Deaton (1985), or by simply assuming an equal constant for all households within one cohort. A major departure of this study from previous studies is the estimation of non-linear models. Although ignoring non-linearity might not be a problem for analytical purposes, it poses a serious problem for long-term forecasting. A linear model cannot accommodate the saturation effect of car ownership and would over-estimate car ownership in the long run.
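The contrast between the two specifications can be written out as below. The notation is an assumption made here for illustration and is not taken from the paper: C_ct denotes cars per household for cohort c in year t, x_ct an explanatory variable such as income, alpha_c a cohort-specific constant, and S the saturation level.

```latex
% Linear specification: predicted ownership grows without bound as x increases.
\[
  C_{ct} = \alpha_c + \beta x_{ct} + \varepsilon_{ct}
\]

% Logistic specification with saturation level S (assumed functional form).
\[
  C_{ct} = \frac{S}{1 + \exp\{-(\alpha_c + \beta x_{ct})\}} + \varepsilon_{ct}
\]

% For beta > 0 the fitted value approaches S, so forecasts cannot exceed saturation.
\[
  \lim_{x_{ct} \to \infty} \frac{S}{1 + \exp\{-(\alpha_c + \beta x_{ct})\}} = S
\]
```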

RESET tests of the various linear models constructed using the pseudo-panel dataset reveal the presence of non-linearity, so various non-linear models have been estimated. The logistic models are estimated directly using Non-linear Least Squares (NLSQ) methods. The saturation level is estimated using two different approaches: first directly using NLSQ, and second using the DOGIT model as proposed by Daly (1999). Models with other functional forms, such as Gompertz and Weibull, can be estimated using maximum likelihood methods.
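To give a feel for how a saturation level can be recovered from data, the sketch below fits a logistic curve by a grid search over candidate saturation levels, with a log-odds linearisation and ordinary least squares for the remaining two parameters. This is a crude stand-in for a proper NLSQ routine and is not the estimator used in the study; the function names, the single explanatory variable, and the noise-free synthetic data are all assumptions made for illustration.

```python
import math


def logistic(x, s, a, b):
    """Logistic curve with saturation level s."""
    return s / (1.0 + math.exp(-(a + b * x)))


def ols(xs, zs):
    """Simple least-squares fit z = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    mz = sum(zs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    b = sxz / sxx
    return mz - b * mx, b


def fit_logistic(xs, ys, s_grid):
    """Pick the candidate saturation level with the smallest squared error.

    Assumes every y is strictly positive. Candidates not above max(ys)
    are skipped because the log-odds transform is undefined for them.
    Returns (sse, s, a, b).
    """
    best = None
    for s in s_grid:
        if s <= max(ys):
            continue
        # For fixed s, log(y / (s - y)) = a + b * x is linear in x.
        zs = [math.log(y / (s - y)) for y in ys]
        a, b = ols(xs, zs)
        # Judge each candidate by its error on the original scale.
        sse = sum((y - logistic(x, s, a, b)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, s, a, b)
    return best


# Synthetic, noise-free data generated from known parameters (illustrative only).
xs = [float(i) for i in range(10)]
ys = [logistic(x, 1.2, -1.0, 0.5) for x in xs]
s_grid = [round(1.0 + 0.05 * k, 2) for k in range(1, 21)]
sse, s_hat, a_hat, b_hat = fit_logistic(xs, ys, s_grid)
```

With real, noisy cohort data a dedicated non-linear optimiser would normally be preferred, since the inner step here minimises the error on the log-odds scale rather than on the original scale.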

To aggregate the cohort results into a forecast of car ownership, it is necessary to forecast the number of households and other economic and demographic variables (such as income and household size) for each cohort. As most data are available at person level, assumptions have been made regarding household structure to enable the conversion of person data to household data. Furthermore, some cohorts cannot be modelled directly (those with younger heads of household, which do not have enough observations in the Family Expenditure Survey). This problem can be solved by analysing the relationship between the cohort constants and assuming that the other regression parameters are the same across cohorts.
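The aggregation step can be illustrated as below. This is a sketch under simple assumptions, not the method used in the study: it converts persons to households by dividing by an assumed average household size, and the function names, dictionary keys, and figures are all hypothetical.

```python
def households_from_persons(persons, avg_household_size):
    """Convert a person count to a household count (assumed simple ratio)."""
    return persons / avg_household_size


def total_cars(cohort_forecasts):
    """Sum households times forecast cars per household over all cohorts."""
    return sum(
        households_from_persons(c["persons"], c["avg_household_size"])
        * c["cars_per_household"]
        for c in cohort_forecasts
    )


# Made-up cohort forecasts for a single future year (illustrative only).
forecasts = [
    {"cohort": (1941, 1945), "persons": 2500, "avg_household_size": 2.5,
     "cars_per_household": 0.5},
    {"cohort": (1946, 1950), "persons": 4000, "avg_household_size": 2.0,
     "cars_per_household": 1.0},
]
total = total_cars(forecasts)
```

In this example the first cohort contributes 1000 households at 0.5 cars each and the second 2000 households at 1.0 car each.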

This project is part of my PhD research at the Economic Department, Birkbeck College. It started from a scoping study on dynamic approaches to car demand forecasting that I conducted while at the Transport Studies Unit, University of Oxford. Research carried out over the past two years has yielded some successful results.


Association for European Transport