The generalized dynamic factor model: one-sided estimation and forecasting.

Publication: Journal of the American Statistical Association
Publication Date: 01-SEP-05

1. INTRODUCTION

Economists and forecasters today typically have access to information scattered across a huge number of observed aggregated and disaggregated economic time series. Intuition suggests that concentrating on a few series, and thus disregarding potentially relevant information, or performing "naive" aggregation always produces suboptimal forecasts: the more scattered the information, the more severe the loss of forecasting efficiency. Yet most multivariate forecasting methods in the literature are restricted to vector time series of low dimension and allow for incorporating only a limited number of key variables. Such methods are thus of little help in large panels of time series, where the cross-sectional dimension is often of the same order as, or even larger than, the series lengths.

As a solution to this large-size problem, recent literature has given much attention to dynamic factor models, whose main features are (a) an infinite number of cross-sectional units; (b) a decomposition of each observed variable $x_{it}$ in the panel into two mutually orthogonal unobservable components, the common component $\chi_{it}$ and the idiosyncratic component $\xi_{it}$; (c) a small dynamic dimension of the common components $\chi_{it}$, which are determined by dynamic loading of a finite number, $q$, of common factors; and (d) a "weak correlation structure" (a notion defined more precisely later) of the idiosyncratic components $\xi_{it}$, which need not be (in contrast with traditional factor models) mutually orthogonal across the panel.

Adopting a parametric approach, Quah and Sargent (1993) estimated such a large cross-sectional model by maximum likelihood, under the restriction of orthogonal idiosyncratic components. Doz, Giannone, and Reichlin (2003) implemented a maximum likelihood estimator (MLE) by forcing orthogonality among the idiosyncratic components, and showed that the impact of the resulting misspecification is negligible as the cross-section size tends to infinity.

Weakly correlated idiosyncratic components are dealt with directly in the nonparametric approach adopted by Forni and Reichlin (1998), Forni, Hallin, Lippi, and Reichlin (2000, 2004), Forni and Lippi (2001), and Stock and Watson (2002a,b). The approach of Stock and Watson (SW hereinafter), based on principal components of contemporaneous values of the x's, can also be used for forecasting. The approach of Forni, Hallin, Lippi, and Reichlin (FHLR hereinafter), based on frequency-domain principal components and thus on two-sided filtering of the x's, although more efficient than SW's for estimation of the common components (see FHLR 2000), is not directly suitable for prediction.

In the present article, still in a nonparametric spirit, we combine the advantages of the FHLR and SW methods to propose a new predictor. Following those previous methods, we start with the observation that the forecast of any of the x's can be obtained as the sum of the forecasts of the common and the idiosyncratic components, each based on its own past values. The idiosyncratic component, being mildly cross-correlated, can be predicted by means of traditional univariate or low-dimensional forecasting methods. Thus we concentrate on prediction of the common components $\chi_{it}$. Such prediction is obtained by first estimating the factor space by linear combinations of the x's. As the cross-section size tends to infinity, the idiosyncratic components (being poorly correlated) cancel out, and the factor space is approached. The predictor is then obtained by projecting future values of the $\chi$'s on the estimated factor space.

The novelty of this article lies both in the estimation of the factor space and in the way in which projections onto this space are performed. We proceed in two steps. The first step uses the dynamic techniques of FHLR (2000) to obtain estimates of the covariance matrices of common and idiosyncratic components. In the second step, these covariances are used to produce two estimations:

(A) A new estimation of the factor space. We use generalized eigenvectors associated with the estimated covariance matrices of common and idiosyncratic components to obtain (unlike in FHLR 2000) linear combinations of contemporaneous x's, referred to as generalized principal components, with minimum idiosyncratic-to-common variance ratio.

(B) A new estimation of the projection of future values of the $\chi$'s on the factor space, based on the estimated lagged covariance matrices of the $\chi$'s.
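
As an illustration of step (A), the following minimal sketch computes generalized eigenvectors for a pair of covariance matrices using NumPy only. It is not the authors' implementation: the function name, argument names, and the toy matrices are hypothetical, and the idiosyncratic covariance matrix is assumed to be positive definite so that a Cholesky factor exists. In practice the two inputs would be the first-step estimates.

```python
import numpy as np

def generalized_principal_components(gamma_chi, gamma_xi, r):
    """Solve gamma_chi v = lam * gamma_xi v and keep the r largest lam.

    Hypothetical helper. gamma_chi and gamma_xi are (n, n) symmetric
    matrices; gamma_xi is ASSUMED positive definite.
    Returns (eigenvalues, z) with z of shape (n, r).
    """
    # Cholesky factor of the idiosyncratic covariance: gamma_xi = L L'
    chol = np.linalg.cholesky(gamma_xi)
    chol_inv = np.linalg.inv(chol)
    # Reduce the generalized problem to a standard symmetric one
    m = chol_inv @ gamma_chi @ chol_inv.T
    eigvals, eigvecs = np.linalg.eigh(m)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:r]      # indices of the r largest
    # Map back: v = L^{-T} u, which normalizes v' gamma_xi v = 1
    z = chol_inv.T @ eigvecs[:, order]
    return eigvals[order], z

# Toy inputs (illustrative only): one common factor loaded equally on three
# series, with the third series carrying a larger idiosyncratic variance.
gamma_chi = np.ones((3, 3))
gamma_xi = np.diag([1.0, 1.0, 4.0])
lam, z = generalized_principal_components(gamma_chi, gamma_xi, r=1)
```

A large generalized eigenvalue corresponds to a large common-to-idiosyncratic variance ratio, so keeping the largest eigenvalues selects the combinations with the smallest idiosyncratic-to-common ratio. In the toy example the third series, which has the largest idiosyncratic variance, receives the smallest weight in absolute value.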

Both our two-step predictor and SW's predictor are consistent, in the sense that, as the cross-section size n and the number of time observations T tend to infinity, both predictors tend in probability to the population-optimal predictor. However, we show that our predictor outperforms SW's predictor in simulations as well as on SW's own dataset. Intuitively, our predictor indeed has a twofold advantage over SW's.

First, whereas SW's estimation of the h-step-ahead projection matrix is based on the lag-h covariance matrix of the x's, our method uses the first-step estimate of the lag-h covariance matrix of the common components. Such an estimate is based on the frequency domain principal components (as in FHLR 2000), which allow efficient aggregation of variables that may be out of phase, so that the information contained in all cross-covariances of the x's, both lagged and contemporaneous, is fully exploited to obtain the h-step-ahead projection matrix.
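
The role of the lag-h covariance matrix can be made concrete with a small sketch of a linear projection. It assumes that the common and idiosyncratic components are orthogonal at all leads and lags, so that the covariance between the common components h steps ahead and the current x's equals the lag-h covariance of the common components. The function and argument names are hypothetical, and the inputs stand in for estimates that would be obtained elsewhere; this is a sketch under those assumptions, not the article's exact estimator.

```python
import numpy as np

def project_common_component(gamma_chi_h, gamma_0, z, x_t):
    """Project the common components h steps ahead on the aggregates z' x_t.

    Hypothetical helper.
    gamma_chi_h : (n, n) lag-h covariance of the common components (assumed given)
    gamma_0     : (n, n) contemporaneous covariance of the x's (assumed given)
    z           : (n, r) weights defining r linear combinations of the x's
    x_t         : (n,) current observation vector
    """
    # Covariance between the future common components and the aggregates
    cross_cov = gamma_chi_h @ z                    # shape (n, r)
    # Covariance matrix of the aggregates themselves
    agg_var = z.T @ gamma_0 @ z                    # shape (r, r)
    # Solve a linear system rather than forming an explicit inverse
    coeffs = np.linalg.solve(agg_var, z.T @ x_t)   # shape (r,)
    return cross_cov @ coeffs

# Toy inputs (illustrative only)
pred = project_common_component(
    gamma_chi_h=0.5 * np.eye(2),
    gamma_0=np.eye(2),
    z=np.eye(2),
    x_t=np.array([2.0, -4.0]),
)
```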

Second, our generalized principal components method performs better than SW's standard principal components method in approaching the factor space, because it exploits preliminary estimation of the contemporaneous covariance matrices of common and idiosyncratic components. Roughly speaking, our first step enables us to place smaller weights on variables with larger idiosyncratic components, so that the idiosyncratic "error" contained in the linear combination is minimized.

The article is organized as follows. In Section 2 we set up the model and the main assumptions. In Section 3 we provide a detailed presentation of the two-step method and our predictor. In Section 4 we prove consistency. In Section 5 we compare our two-step method with SW's method using simulated panels, and briefly report the results of an in-depth comparison based on the empirical panel used by SW (2002b). We conclude in Section 6. Some mathematical results, needed in Section 4, are given in the Appendix.

2. THE MODEL

The model used in this article is an approximate factor model, in that the idiosyncratic components are allowed to be weakly correlated, as was done by Chamberlain (1983) and Chamberlain and Rothschild (1983) and in contrast to the approaches of Sargent and Sims (1977), Geweke (1977), and Quah and Sargent (1993). It is a dynamic factor model in that the common factors are loaded through a lag structure, as was done by FHLR (2000), Forni and Lippi (2001), SW (2002a,b), Bai and Ng (2002), and Bai (2003). However, unlike the approaches of FHLR (2000) and Forni and Lippi (2001), the lag structure is assumed to be finite.

Denote by $X = (x_{it})_{i=1,\dots,n;\ t=1,\dots,T}$ an $n \times T$ rectangular array of observations. Throughout, we assume the following:

A1. $X$ is a finite realization of a real-valued stochastic process $\{x_{it} \in L_2(\Omega, \mathcal{F}, P),\ i \in \mathbb{N},\ t \in \mathbb{Z}\}$, where all $n$-dimensional vector processes $\{x_t = (x_{1t} \cdots x_{nt})',\ t \in \mathbb{Z}\}$, $n \in \mathbb{N}$, are stationary, with zero mean and finite second-order moments, $\Gamma_k = E[x_t x'_{t-k}]$, $k \in \mathbb{N}$.
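
The lag-k second-order moment matrices in A1 have a natural sample counterpart. The sketch below is one common convention, not necessarily the one used in the article: the function name is hypothetical, the divisor T - k is an assumption (dividing by T is an equally common choice), and the rows are demeaned as a precaution for raw data.

```python
import numpy as np

def sample_lagged_covariance(x, k):
    """Sample estimate of E[x_t x_{t-k}'] from an (n, T) panel.

    Hypothetical helper. Rows are series, columns are time periods.
    """
    n, T = x.shape
    if not 0 <= k < T:
        raise ValueError("lag k must satisfy 0 <= k < T")
    # Remove each series' sample mean (precaution; the model assumes mean zero)
    x = x - x.mean(axis=1, keepdims=True)
    # Pair x_t (columns k..T-1) with x_{t-k} (columns 0..T-k-1)
    return x[:, k:] @ x[:, :T - k].T / (T - k)

# Toy panel (illustrative only): a single alternating series
panel = np.array([[1.0, -1.0, 1.0, -1.0]])
gamma_0_hat = sample_lagged_covariance(panel, 0)
gamma_1_hat = sample_lagged_covariance(panel, 1)
```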

The spectral techniques to be used in the sequel also require the following technical assumption:

A2. For all $n \in \mathbb{N}$, the process $\{x_t,\ t \in \mathbb{Z}\}$ admits a Wold representation $x_t = \sum_{k=0}^{\infty} C_k w_{t-k}$, where the full-rank $n$-dimensional innovations $w_t$ have finite moments of order 4 and the $n \times n$ matrices $C_k = (C_{ij,k})$ satisfy $\sum_{k=0}^{\infty} |C_{ij,k}| k^{1/2} < \infty$ for all $n, i, j \in \mathbb{N}$.

We refer to assumptions A1 and A2 jointly as assumption A.

To avoid heavy notation, we do not make explicit the dependence on $n$ of the vectors $x_t$ and $w_t$, of the matrices $\Gamma_k$ and $C_k$, and of many other scalar, vector, and matrix quantities defined later. In the same way, we avoid explicit reference to $T$ for estimated quantities. For example, we denote an estimate of $\Gamma_k$, which depends on $n$ and $T$, by $\hat{\Gamma}_k$.

The basic idea in dynamic factor analysis is that each process $x_{it}$, $i \in \mathbb{N}$, is the sum of a common component, $\chi_{it}$, and an idiosyncratic component, $\xi_{it}$. The common component is driven by a $q$-dimensional vector of common factors, $f_t = (f_{1t}\ f_{2t}\ \cdots\ f_{qt})'$, which are loaded with possibly different coefficients and lags,

$$\chi_{it} = b_{i1}(L) f_{1t} + b_{i2}(L) f_{2t} + \cdots + b_{iq}(L) f_{qt}.$$
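
To make the lag-polynomial loading concrete, the following sketch builds common components from given factors and finite-order loading polynomials. The array layout, the function name, and the toy numbers are assumptions made for illustration and do not come from the article.

```python
import numpy as np

def build_common_components(loadings, factors):
    """Apply finite lag polynomials to the factors.

    Hypothetical helper.
    loadings : (n, q, s + 1) array; loadings[i, j, l] is the coefficient
               on lag l in the polynomial applied to factor j for series i
    factors  : (q, T) array of factor values
    Returns an (n, T - s) array covering periods s, ..., T - 1, the periods
    for which all required lags of the factors are available.
    """
    n, q, n_lags = loadings.shape
    s = n_lags - 1
    T = factors.shape[1]
    chi = np.zeros((n, T - s))
    for lag in range(n_lags):
        # factors[:, s - lag : T - lag] holds the factors lagged by `lag`
        chi += loadings[:, :, lag] @ factors[:, s - lag:T - lag]
    return chi

# Toy example (illustrative only): one series, one factor, polynomial 1 + 0.5 L
toy_loadings = np.array([[[1.0, 0.5]]])
toy_factors = np.array([[1.0, 2.0, 3.0]])
toy_chi = build_common_components(toy_loadings, toy_factors)
```

In the toy example the two computed values are 2 + 0.5 * 1 = 2.5 and 3 + 0.5 * 2 = 4.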

Note that $q$ is independent of $n$ (and small compared with $n$ in empirical applications). In vector notation, defining $\chi_t = (\chi_{1t} \cdots \chi_{nt})'$ and $\xi_t = (\xi_{1t} \cdots \xi_{nt})'$, and letting $B(L)$ denote the $n \times q$ matrix...
