date: 14 December 2019

# Frequency-Domain Approach in High-Dimensional Dynamic Factor Models

## Abstract and Keywords

High-Dimensional Dynamic Factor Models have their origin in macroeconomics, more precisely in empirical research on Business Cycles. The central idea, going back to the work of Burns and Mitchell in the 1940s, is that the fluctuations of all the macro and sectoral variables in the economy are driven by a “reference cycle,” that is, a one-dimensional latent cause of variation. After a fairly long process of generalization and formalization, the literature settled at the beginning of the 2000s on a model in which (1) both $n$, the number of variables in the dataset, and $T$, the number of observations for each variable, may be large, and (2) all the variables in the dataset depend dynamically on a fixed number, independent of $n$, of “common factors,” plus variable-specific components, usually called “idiosyncratic.” The structure of the model can be exemplified as follows:

$$x_{it}=\alpha_i u_t+\beta_i u_{t-1}+\xi_{it},$$
(*)

where the observable variables $x_{it}$ are driven by the white noise $u_t$, the common factor, which is common to all the variables, and by the idiosyncratic component $\xi_{it}$. The common factor $u_t$ is orthogonal to the idiosyncratic components $\xi_{it}$, and the idiosyncratic components are mutually orthogonal (or weakly correlated). Lastly, the variations of the common factor $u_t$ affect the variable $x_{it}$ dynamically, that is, through the lag polynomial $\alpha_i+\beta_i L$. Asymptotic results for High-Dimensional Factor Models, particularly consistency of estimators of the common factors, are obtained for both $n$ and $T$ tending to infinity.

Model $(∗)$, generalized to allow for more than one common factor and a rich dynamic loading of the factors, has been studied in a fairly vast literature, with many applications based on macroeconomic datasets: (a) forecasting of inflation, industrial production, and unemployment; (b) structural macroeconomic analysis; and (c) construction of indicators of the Business Cycle. This literature can be broadly classified as belonging to the time- or the frequency-domain approach. The works based on the second are the subject of the present chapter.

We start with a brief description of early work on Dynamic Factor Models. Formal definitions and the main Representation Theorem follow. The latter determines the number of common factors in the model by means of the spectral density matrix of the vector $(x_{1t}\ x_{2t}\ \cdots\ x_{nt})$. Dynamic principal components, based on the spectral density of the $x$’s, are then used to construct estimators of the common factors.

These results, obtained in the early 2000s, are compared to the literature based on the time-domain approach, in which the covariance matrix of the $x$’s and its (static) principal components are used instead of the spectral density and dynamic principal components. Dynamic principal components produce two-sided estimators, which are good within the sample but unfit for forecasting. The estimators based on the time-domain approach are simple and one-sided. However, they require the restriction of finite dimension for the space spanned by the factors.

Recent papers have constructed one-sided estimators based on the frequency-domain method for the unrestricted model. These papers exploit results on stochastic processes of dimension $n$ that are driven by a $q$-dimensional white noise, with $q<n$, that is, singular vector stochastic processes. The main features of this literature are described in some detail.

Lastly, we report and comment on the results of an empirical paper, the last in a long list, comparing predictions obtained with time- and frequency-domain methods. The paper uses a large monthly U.S. dataset including the Great Moderation and the Great Recession.

# Introduction

A prerequisite for the present chapter is a basic knowledge of the frequency-domain analysis of weakly stationary stochastic processes. In particular, the reader should know how a function $f$, defined for the frequency $\theta$ in the domain $[-\pi,\pi]$, is transformed (via its Fourier expansion) into a filter $\underline{f}(L)$, where $L$ is the lag operator. However, the main definitions and results in sections “General Assumptions and Representation Theorem”, “Estimation of the General Model: Two-Sided Filters”, and “The Finite-Dimension Assumption and the Static Method” are illustrated by an elementary example, referred to as Example 1. Readers unfamiliar with frequency-domain methods can use the example to understand the frequency-domain approach and how it differs from the time-domain approach.

In section “Early Literature on Dynamic Factor Models” we briefly review papers introducing and developing the idea that all the main aggregate indicators of the state of an economy are driven by a small number of common shocks (common factors), plus variable-specific (idiosyncratic) components, and that different indicators load the common factors with different weights and different lag polynomials.

In section “General Assumptions and Representation Theorem” we set the basic definition of the High-Dimensional Dynamic Factor Model and state the main Representation Theorem. The observable variables $x_{it}$ are driven by $q$ common dynamic factors if and only if the first $q$ eigenvalues of the spectral density of the $x$’s diverge whereas the $(q+1)$-th is bounded. The theorem is illustrated by means of Example 1. Applications of the model to forecasting and structural analysis are also introduced.

In section “Estimation of the General Model: Two-Sided Filters” we introduce the frequency-domain principal components, or dynamic principal components, of the $x$’s and show how to construct estimators for the spectral density of the common components $\chi_{it}$ and for the common components $\chi_{it}$ themselves. Forni, Hallin, Lippi, and Reichlin (2000) introduced this estimator and showed, using simulated data, that it performs very well. However, as Example 1 shows, this estimator applies a two-sided filter to the observable variables $x_{it}$, a feature that makes it unsuitable for prediction. Alternative estimators, overcoming this difficulty and still based on the frequency-domain approach, are presented in sections “Frequency-Domain Approach in the Finite-Dimensional Case” and “Estimation of the General Model: One-Sided Filters.”

In section “The Finite-Dimension Assumption and the Static Method” we compare the static method, based on standard (static) principal components, and the frequency-domain (dynamic) method. The former produces one-sided estimators and predictors, but requires that the space spanned by the common components $\chi_{it}$ be finite-dimensional. We argue that this condition fails even in elementary cases.

An application of the frequency-domain method in the finite-dimensional case is outlined in section “Frequency-Domain Approach in the Finite-Dimensional Case.” The estimated spectral density of the variables $\chi_{it}$ is used to estimate the $\chi$’s themselves using generalized static principal components and one-sided filters.

In section “Estimation of the General Model: One-Sided Filters” we outline the results of recent works in which one-sided estimators of the common components, based on the frequency-domain method, are obtained in the general case, that is, without assuming finite dimension for the space spanned by the variables $\chi_{it}$. This gives a complete solution to the problem outlined in section “Estimation of the General Model: Two-Sided Filters.”

In section “The Static and Dynamic Methods Compared: Results Based on Simulated and Empirical Data” we report results comparing estimation and forecasting by means of the static and the dynamic method, using both simulated and U.S. monthly macroeconomic data.

Section “Conclusions and Open Problems for Future Research” concludes and outlines some ideas for future developments.

# Early Literature on Dynamic Factor Models

The idea that the main indicators of an economy, such as aggregate income, consumption, investment, production, et cetera, are driven by a common factor, plus variable-specific, usually called idiosyncratic, components was introduced in Burns and Mitchell (1946) in the context of Business Cycle descriptive studies. The common factor represents a background cyclical fluctuation. Taking the common factor as a reference, the variables of the economy are classified as lagging, coincident, and leading.

Sargent and Sims (1977) provide a generalization and formalization of Burns and Mitchell’s one-factor model. Given a vector of $n$ observable time series $x_{it}$, $i=1,2,\ldots,n$, they assume that (i) the variables $x_{it}$ are (after suitable transformations) stationary and costationary, (ii) they are driven by a small, as compared to $n$, number of common factors plus idiosyncratic components, and (iii) each of the variables $x_{it}$ loads the common factors with specific weights and lag polynomials. Precisely, denoting by $q$ the number of common factors,

$$x_{it}=a_{i1}(L)u_{1t}+a_{i2}(L)u_{2t}+\cdots+a_{iq}(L)u_{qt}+\xi_{it},$$
(1)

where:

(a) The vector $u_t=(u_{1t}\ u_{2t}\ \cdots\ u_{qt})'$, the common factors, is a $q$-dimensional white noise.

(b) $L$ is the lag operator and $a_{ij}(L)$ is a possibly infinite polynomial in $L$; thus the relationship between the variables $x_{it}$ and the factors is dynamic.

(c) The common and the idiosyncratic components are orthogonal at any lead and lag, that is, $u_t\perp\xi_{is}$ for all $i,t,s$, so that, setting $\chi_{it}=a_{i1}(L)u_{1t}+a_{i2}(L)u_{2t}+\cdots+a_{iq}(L)u_{qt}$, we have $\chi_{it}\perp\xi_{js}$ for all $i,j,t,s$.

(d) Idiosyncratic components relative to different variables are orthogonal at any lead and lag, that is, $\xi_{it}\perp\xi_{js}$ for all $i\neq j$ and all $t,s$.

In the frequency-domain approach adopted in Sargent and Sims (1977), these assumptions produce the following decomposition of the spectral density of the vector $x_t=(x_{1t}\ x_{2t}\ \cdots\ x_{nt})$, call it $\Sigma^x(\theta)$, for $\theta\in[-\pi,\pi]$,

$$\Sigma^x(\theta)=\Sigma^\chi(\theta)+\Sigma^\xi(\theta),$$
(2)

where on the right-hand side we have the spectral densities of the common and idiosyncratic components. Note that assumption (d) is crucial for the identification of the common and idiosyncratic components: without condition (d), the decomposition in equation (2) cannot be unique.

Equation (2) is the basis for Maximum-Likelihood estimation of the factor model; see Sargent and Sims (1977) and Geweke and Singleton (1981). See also Quah and Sargent (1993) for an application to a case with a fairly large $n$ and the use of time-domain techniques.

Factor models with an infinite “cross section,” as opposed to a fixed $n$, were introduced in Chamberlain (1983) and Chamberlain and Rothschild (1983). An infinite cross section has the important consequence that condition (d) can be relaxed into “weak correlation,” while keeping the identification of common and idiosyncratic components (see below, section “General Assumptions and Representation Theorem”, Observation 1). However, Chamberlain and Rothschild did not consider dynamic loading of the factors.

At the end of the nineties, Forni and Reichlin (1996, 1998), Forni and Lippi (1997), and Stock and Watson (1998) introduced Dynamic Factor Models like (1), with an infinite “cross section,” that is, Large-Dimensional Dynamic Factor Models. The latter share with Sargent and Sims (1977) the idea of a small number of common factors and conditions (a), (b), and (c) above. The main difference is that, whereas in Sargent and Sims (1977) the number of variables is possibly large but fixed, in High-Dimensional Dynamic Factor Models asymptotic results are obtained with both the number of observations and the number of variables tending to infinity.

Subsequent work on High-Dimensional Dynamic Factor Models can be broadly classified as belonging to the time- or the frequency-domain approach. The latter, reviewed in the present chapter, has produced a literature which is rather thin as compared to that based on time-domain techniques. This can be partly explained by the fact that High-Dimensional Dynamic Factor Models have been mostly developed as a tool for macroeconometric applications, and, with a few exceptions, the spectral analysis of vector time series does not belong to the typical toolkit of researchers in this field.

# General Assumptions and Representation Theorem

The datasets considered in the literature on High-Dimensional Dynamic Factor Models are of the form

$$X=\{X_{it},\ i=1,2,\ldots,n,\ t=1,2,\ldots,T\},$$
(3)

where $n$, the size of the cross section, is comparable to, or even larger than, $T$. The corresponding theoretical assumption is that $X$ is a finite realization of a countable family of stochastic processes

$$\{x_{it},\ i\in\mathbb{N},\ t\in\mathbb{Z}\}.$$
(4)

We suppose that the vector process $(x_{i_1t}\ x_{i_2t}\ \cdots\ x_{i_mt})'$ is weakly stationary for all $m$-tuples $i_1,i_2,\ldots,i_m$, which implies that all the processes $x_{it}$ are weakly stationary. Moreover, we suppose that $(x_{i_1t}\ x_{i_2t}\ \cdots\ x_{i_mt})'$ possesses a spectral density matrix.

Define $\mathbf{x}_{nt}=(x_{1t}\ x_{2t}\ \cdots\ x_{nt})'$. We denote by $\Sigma^x_n(\theta)$, $\theta\in[-\pi,\pi]$, the spectral density of the vector $\mathbf{x}_{nt}$. Its eigenvalues are real and non-negative. We denote them by

$$\lambda^x_{n1}(\theta)\ge\lambda^x_{n2}(\theta)\ge\cdots\ge\lambda^x_{nn}(\theta),$$

with corresponding normalized row eigenvectors

$Display mathematics$

such that

$Display mathematics$

Assumption 1 There exists an integer $q≥0$ such that for $j≤q$,

$$\lim_{n\to\infty}\lambda^x_{nj}(\theta)=\infty$$

for $\theta$ almost everywhere in $[-\pi,\pi]$. Moreover, there exists a positive real $Q$ such that $\lambda^x_{n,q+1}(\theta)\le Q$ for all $n\in\mathbb{N}$ and $\theta$ almost everywhere in $[-\pi,\pi]$.

The following representation result is proved in Forni and Lippi (2001); see also Hallin and Lippi (2013) for a different approach to the representation theory of the dynamic factor model. With no loss of generality we assume that all the stochastic variables $x_{it}$ and their components are zero mean.

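Theorem 1 Suppose that Assumption 1 holds with $q>0$. Then there exists a $q$-dimensional orthonormal white noise $u_t=(u_{1t}\ u_{2t}\ \cdots\ u_{qt})'$ and square-summable filters $a_{ij}(L)$, $i\in\mathbb{N}$, $j=1,2,\ldots,q$, such that

$$x_{it}=\chi_{it}+\xi_{it},\qquad \chi_{it}=a_{i1}(L)u_{1t}+a_{i2}(L)u_{2t}+\cdots+a_{iq}(L)u_{qt},$$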
(5)

where:

(i) The eigenvalues of $\Sigma^\chi_n(\theta)$, denoted by $\lambda^\chi_{nj}(\theta)$, in decreasing order, fulfill

$$\lim_{n\to\infty}\lambda^\chi_{nj}(\theta)=\infty$$

for $j\le q$ and $\theta$ almost everywhere in $[-\pi,\pi]$.

(ii) Denoting by $\lambda^\xi_{nj}(\theta)$ the eigenvalues of $\Sigma^\xi_n(\theta)$, there exists a positive real $W$ such that $\lambda^\xi_{n1}(\theta)\le W$ for all $n\in\mathbb{N}$ and $\theta$ almost everywhere in $[-\pi,\pi]$.

(iii) $\xi_{it}$ and $u_s$ are orthogonal for all $t,s\in\mathbb{Z}$.

Conversely, if the processes $x_{it}$ have representation (5) and (i), (ii), (iii) hold, then Assumption 1 holds for the eigenvalues $\lambda^x_{nj}(\theta)$.

The above result generalizes results proved in Chamberlain (1983) and Chamberlain and Rothschild (1983) for a static model (i.e., a model in which the polynomials $a_{ij}(L)$ are constant). The processes $\chi_{it}$, $u_t$, and $\xi_{it}$ are called the common components, the common dynamic factors, and the idiosyncratic components, respectively.

The main results in the literature on High-Dimensional Dynamic Factor Models are the construction of estimators for the number $q$ of common shocks (see section “Estimation of the General Model: Two-Sided Filters”) and for the components $\chi_{it}$ and $\xi_{it}$ (see sections “Estimation of the General Model: Two-Sided Filters”, “The Finite-Dimension Assumption and the Static Method”, “Frequency-Domain Approach in the Finite-Dimensional Case”, and “Estimation of the General Model: One-Sided Filters”) that are consistent as both $n$ and $T$ tend to infinity.

Observation 1 Condition (ii) is the definition of an idiosyncratic component in this literature. Note that it is an asymptotic definition. It includes the “pure idiosyncratic” case, in which the $\xi$’s have the same variance and are orthogonal to one another, but also allows for “weak correlation” among the $\xi$’s. For example, suppose that $\eta_{it}$ is purely idiosyncratic and

$Display mathematics$

(correlation with previous neighbor only). It is easily seen that the maximum eigenvalue of the spectral density matrix of the $ξ$’s is bounded as $n$ tends to infinity.
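The boundedness claim can be illustrated with a small sketch. The displayed formula is not reproduced here, so the code assumes a hypothetical form, $\xi_{it}=\eta_{it}+c\,\eta_{i-1,t}$ with orthonormal white-noise $\eta$’s and an illustrative coefficient `c`. Under that assumption the covariance matrix of the $\xi$’s is tridiagonal Toeplitz, its eigenvalues are known in closed form, and the spectral density equals this matrix divided by $2\pi$ at every frequency.

```python
import math

def max_eig_weakly_correlated(n, c):
    """Largest eigenvalue of the covariance matrix of (xi_1t, ..., xi_nt).

    Assumed (hypothetical) form: xi_it = eta_it + c * eta_{i-1,t}, so the
    matrix is tridiagonal Toeplitz with diagonal 1 + c^2 and off-diagonal c.
    Its eigenvalues are (1 + c^2) + 2 c cos(k pi / (n + 1)), k = 1, ..., n.
    """
    return max((1 + c * c) + 2 * c * math.cos(k * math.pi / (n + 1))
               for k in range(1, n + 1))

c = 0.5                            # illustrative value, not from the text
bound = (1 + abs(c)) ** 2          # Gershgorin bound, independent of n
for n in (10, 100, 1000):
    print(n, round(max_eig_weakly_correlated(n, c), 4), bound)
```

The largest eigenvalue increases with $n$ but never exceeds the bound $(1+|c|)^2$, which is the sense in which the correlation is “weak.”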

The following minimal example, though extremely simple, is very useful to illustrate both the common-idiosyncratic and the dynamic structure of the model.

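Example 1 Let $u_t$ be a scalar white noise of unit variance and

$$x_{it}=\begin{cases}u_t+\xi_{it} & \text{if } i \text{ is odd,}\\ u_{t-1}+\xi_{it} & \text{if } i \text{ is even,}\end{cases}$$
(6)

where $\xi_{it}$ has unit variance for all $i$, and $\xi_{it}$ is orthogonal to $u_s$ and to $\xi_{js}$ for all $i\neq j$ and all $s$ and $t$. Letting $n$ be even for convenience,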

$Display mathematics$
(7)

The first eigenvalue and eigenvector of $\Sigma^x_n(\theta)$ are:

$Display mathematics$

Moreover, for $j>1$, $\lambda^x_{nj}(\theta)=1/(2\pi)$. The first eigenvalues of $\Sigma^\chi_n(\theta)$ and $\Sigma^\xi_n(\theta)$ are $n/(2\pi)$ and $1/(2\pi)$, respectively.
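The eigenvalue claims of Example 1 can be checked numerically. The sketch below assumes that in (6) the odd-indexed variables load $u_t$ and the even-indexed ones load $u_{t-1}$, as the later discussion of the example indicates, and that the idiosyncratic components are white noise. Under those assumptions the largest eigenvalue equals $(n+1)/(2\pi)$, the sum of the common part $n/(2\pi)$ and the idiosyncratic part $1/(2\pi)$. The helper names are illustrative only.

```python
import cmath
import math

def spectral_density(n, theta):
    """Population spectral density of (x_1t, ..., x_nt) at frequency theta."""
    # Loading of u_t: 1 for odd i, exp(-i*theta) (one lag) for even i.
    c = [1.0 if i % 2 == 1 else cmath.exp(-1j * theta) for i in range(1, n + 1)]
    sigma = [[(c[j] * c[k].conjugate() + (1.0 if j == k else 0.0)) / (2 * math.pi)
              for k in range(n)] for j in range(n)]
    return sigma, c

def matvec(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

n, theta = 6, 0.7
sigma, c = spectral_density(n, theta)
lam1 = (n + 1) / (2 * math.pi)

# The loading vector c is an eigenvector with eigenvalue (n + 1) / (2*pi).
sv = matvec(sigma, c)
err_first = max(abs(sv[k] - lam1 * c[k]) for k in range(n))

# Any vector orthogonal to c has eigenvalue 1 / (2*pi); e.g. w = (1, 0, -1, 0, ...).
w = [0.0] * n
w[0], w[2] = 1.0, -1.0
sw = matvec(sigma, w)
err_rest = max(abs(sw[k] - w[k] / (2 * math.pi)) for k in range(n))

# The trace equals the sum of the eigenvalues: (n + 1) + (n - 1) = 2n, over 2*pi.
trace = sum(sigma[k][k] for k in range(n)).real
print(err_first, err_rest, trace * 2 * math.pi)
```

Rerunning with a larger `n` shows the first eigenvalue growing linearly in $n$ while the others stay at $1/(2\pi)$, which is the pattern required by Assumption 1 with $q=1$.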

The next two subsections introduce the most important applications of the High-Dimensional Dynamic Factor Model. In the first, Forecasting, the structure of the model, originally motivated by economic intuition, is used for the purpose of reducing dimension when the dataset is too large for standard forecasting techniques. The second, Structural Factor Analysis, is a natural development of the seminal ideas described in section “Early Literature on Dynamic Factor Models.”

## Forecasting

Consider again the dataset (3) and assume that we want to predict the variable $x_{1t}$. Standard methods, a vector autoregression (VAR) in particular, require that $n$ be small as compared to $T$. When $n$ and $T$ are of comparable size, regressing the vector of the $x$’s on its lagged values would easily leave zero degrees of freedom. This is known as the “curse of dimensionality.” On the other hand, if there is reason to assume that the intuition behind factor models, that the $x$’s are driven by a small number of common factors plus idiosyncratic components, is approximately true, then the information contained in the whole dataset can be fully exploited.

To start with, observe that if the components $\chi_{1t}$ and $\xi_{1t}$ were observable then separate prediction of each of them, based on their past values, would outperform the prediction of $x_{1t}$ based on its past values. Of course these predictors are unfeasible because the common and idiosyncratic components are unobservable in general. However, we are assuming that $x_{1t}$ belongs to the family (4), so that, using the whole dataset $X_{it}$, $1\le t\le T$, $1\le i\le n$, consistent estimates for the common and idiosyncratic components can be constructed and used to forecast $x_{1t}$. Some of the predictors proposed are presented in the following sections. Their performance, relative to one another and to univariate or small-dimensional methods, depends on the size of $n$ and $T$ and on the distance between the actual dataset and the assumptions ensuring the validity of the consistency results. Literature comparing forecasting results obtained with different factor models is reviewed in sections “Frequency-Domain Approach in the Finite-Dimensional Case” and “The Static and Dynamic Methods Compared: Results Based on Simulated and Empirical Data.”

## Structural Factor Analysis

Forni and Lippi (2001) show that the integer $q$ and the components $\chi_{it}$ and $\xi_{it}$ are identified in a High-Dimensional Dynamic Factor Model. However, the white-noise vector $u_t$ and the filters $a_{ij}(L)$ in (5) are not. For, if $H$ is an orthogonal matrix, then

$Display mathematics$

where $v_t$ is another orthonormal $q$-dimensional white noise and the filters $b_{ij}(L)$ are square-summable.
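The lack of identification can be made concrete with a short sketch. It assumes $q=2$, a single variable with hypothetical first-order loadings $a(L)=a_0+a_1L$, and a rotation matrix as the orthogonal $H$. It checks that the rotated shocks $v_t=Hu_t$, combined with the rotated loadings $b(L)=a(L)H'$, reproduce the same common component.

```python
import math
import random

random.seed(0)
T, q = 200, 2
u = [[random.gauss(0, 1) for _ in range(q)] for _ in range(T)]

# Hypothetical loadings of one variable: a(L) = a0 + a1 L, each a 1 x q row.
a0, a1 = [1.0, 0.5], [-0.3, 0.8]

# An orthogonal matrix H (rotation by an arbitrary angle phi).
phi = 0.9
H = [[math.cos(phi), -math.sin(phi)], [math.sin(phi), math.cos(phi)]]

def rotate_shock(H, x):       # v_t = H u_t
    return [sum(H[r][c] * x[c] for c in range(q)) for r in range(q)]

def rotate_loading(a, H):     # b = a H'
    return [sum(a[c] * H[r][c] for c in range(q)) for r in range(q)]

v = [rotate_shock(H, ut) for ut in u]
b0, b1 = rotate_loading(a0, H), rotate_loading(a1, H)

def common(l0, l1, shocks, t):
    """Common component at time t for loadings l0 + l1 L."""
    return (sum(l0[j] * shocks[t][j] for j in range(q))
            + sum(l1[j] * shocks[t - 1][j] for j in range(q)))

max_gap = max(abs(common(a0, a1, u, t) - common(b0, b1, v, t)) for t in range(1, T))
print(max_gap)   # zero up to rounding error
```

Since the data only reveal the common component, the pairs $(a(L),u_t)$ and $(b(L),v_t)$ are observationally equivalent, which is why additional identifying restrictions are needed.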

A natural question is whether the shocks $vt$ and the impulse-response functions $bij(L)$ can be identified by means of the techniques used in Structural VAR analysis (SVAR). Stock and Watson (2005) and Forni, Giannone, Lippi, and Reichlin (2009) show that this is possible, provided that the structural shocks are defined as pervasive macroeconomic shocks, that is, shocks that affect all the variables in the economy (common shocks). An interesting feature of this Structural Factor Analysis is that the number of structural common shocks is a property of the economy under consideration, and therefore does not depend on the number of variables of interest, as in SVAR analysis.

# Estimation of the General Model: Two-Sided Filters

Forni et al. (2000) propose an estimator of the common components $\chi_{it}$ based on the dynamic principal components introduced in Brillinger (1981). The difference between dynamic principal components and the standard, static, principal components is that the former, given a stochastic vector $z_t=(z_{1t}\ z_{2t}\ \cdots\ z_{nt})'$, maximize the variance of normalized linear combinations of $z_{i,t-k}$ for $i=1,2,\ldots,n$, $k\in\mathbb{Z}$, that is, combinations of the variables $z_{it}$ and all their leads and lags, whereas standard principal components are allowed to linearly combine only the current values $z_{it}$.

Formally, given the dataset (3), let $\hat\Sigma^x_n(\theta)$ be a consistent estimate of $\Sigma^x_n(\theta)$. We assume for the moment that $q$ is known. Consider the first $q$ eigenvalues and corresponding normalized eigenvectors of $\hat\Sigma^x_n(\theta)$: $\hat\lambda^x_{nj}(\theta)$, $\hat p^x_{nj}(\theta)$. Such functions have Fourier expansions

$Display mathematics$
(8)

The corresponding two-sided filters, denoted by $\underline{\hat\lambda}^x_{nj}(L)$ and $\underline{\hat p}^x_{nj}(L)$, are obtained by replacing $e^{-ik\theta}$ by $L^k$ in (8).
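The passage from a function of the frequency to a filter can be sketched as follows. The code assumes the convention stated above, $f(\theta)=\sum_k c_k e^{-ik\theta}$ with $e^{-ik\theta}$ replaced by $L^k$, approximates the Fourier coefficients on an equispaced grid, and applies the resulting filter to a short series. The function `f` and the helper names are hypothetical and chosen only for illustration.

```python
import cmath
import math

def fourier_coefficients(f, max_lag, grid=64):
    """Return {k: c_k} with f(theta) = sum_k c_k exp(-i k theta), |k| <= max_lag.

    c_k = (1 / 2 pi) * integral of f(theta) exp(i k theta), approximated by an
    average over an equispaced grid on [-pi, pi).
    """
    thetas = [-math.pi + 2 * math.pi * m / grid for m in range(grid)]
    return {k: sum(f(th) * cmath.exp(1j * k * th) for th in thetas) / grid
            for k in range(-max_lag, max_lag + 1)}

def apply_filter(coeffs, x, t):
    """(f(L) x)_t = sum_k c_k x_{t-k}; negative k picks up leads of x."""
    return sum(c * x[t - k] for k, c in coeffs.items()).real

# Hypothetical frequency response with one lag and one lead.
f = lambda th: 1.0 + 0.5 * cmath.exp(-1j * th) + 0.25 * cmath.exp(1j * th)
coeffs = fourier_coefficients(f, max_lag=2)

x = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
y3 = apply_filter(coeffs, x, 3)      # x_3 + 0.5 x_2 + 0.25 x_4
print({k: round(c.real, 6) for k, c in coeffs.items()}, y3)
```

The coefficient at $k=-1$ multiplies $x_{t+1}$: whenever the expansion has nonzero coefficients at negative $k$, the filter is two-sided and needs future observations.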

The $j$-th dynamic principal component of the dataset (3) is defined as

$Display mathematics$

where $\mathbf{X}_{nt}=(X_{1t}\ X_{2t}\ \cdots\ X_{nt})'$. As shown in Brillinger (1981, pp. 337–366), the filter $\underline{\hat p}^x_{n1}(L)$, corresponding to the first dynamic principal component, solves the problem

$Display mathematics$
(9)

where $a(L)=(a_1(L)\ a_2(L)\ \cdots\ a_n(L))$ is an $n$-dimensional two-sided filter, $a_i(L)=\sum_{k=-\infty}^{\infty}a_{ik}L^k$, fulfilling $\sum_{i=1}^{n}\sum_{k=-\infty}^{\infty}a_{ik}^2=1$. The filter $\underline{\hat p}^x_{n2}(L)$, corresponding to the second dynamic principal component, solves problem (9) with the additional condition that $a(L)\mathbf{X}_{nt}$ be orthogonal to $\hat P^x_{n1,s}$ for all $t$ and $s$, and so on.

Now define $\hat p^x_n(\theta)=(\hat p^{x\prime}_{n1}(\theta)\ \hat p^{x\prime}_{n2}(\theta)\ \cdots\ \hat p^{x\prime}_{nq}(\theta))'$ and

$Display mathematics$

which is a $q×n$ filter, so that

$Display mathematics$

is a $q\times1$ vector for $t=1,2,\ldots,T$. The dynamic projection of $\mathbf{X}_{nt}$ on the first $q$ dynamic principal components, that is, the projection of $\mathbf{X}_{nt}$ on all lags and leads of $\hat P^x_{nj,t}$, is

$Display mathematics$

where $F=L^{-1}$. The $i$-th coordinate of this $n$-dimensional vector, that is, the dynamic projection of $X_{it}$ on the first $q$ dynamic principal components, is a consistent estimate of $\chi_{it}$; see Forni et al. (2000):

$Display mathematics$
(10)

Example 2 The construction of the estimate $\hat\chi_{it}$ is now illustrated by means of Example 1. The exercise is carried out using the processes $x_{it}$ instead of the realizations $X_{it}$ and the population spectral density (7) instead of its estimate. Though based on unfeasible estimates, the exercise should help the reader understand the definition of $\hat\chi_{it}$, its consistency, and the possible two-sidedness of the filters in (10). The first dynamic principal component, based on the first eigenvalue of the spectral density matrix (7), is

$Display mathematics$
(11)

We see that the term containing $u_t$ on the right-hand side of (11) has variance growing as $n$, whereas the term containing the idiosyncratic components has unit variance, so that for large $n$ the principal component is almost a multiple of $u_t$. Using (10) and (11),

$Display mathematics$

We have:

$Display mathematics$
(12)

Thus: (i) $\hat\chi_{it}$ converges in mean square to $\chi_{it}$ as $n\to\infty$, (ii) the filters defining $\hat\chi_{it}$ load some of the $x$’s at time $t+1$ (for $i$ odd).
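Both conclusions can be illustrated by simulation. The sketch assumes the form of Example 1 in which odd-indexed variables load $u_t$ and even-indexed ones load $u_{t-1}$, with Gaussian white-noise idiosyncratic components. Under that assumption the population estimator for an odd $i$ reduces to the average of the odd $x$’s at time $t$ and the even $x$’s at time $t+1$. This is a derivation made here for illustration, not a formula quoted from the text.

```python
import random

def simulate_example1(n, T, seed=0):
    """x[i][t] for i = 0..n-1 (variable i+1), t = 0..T-1, plus the shock u."""
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(T + 1)]       # u[t + 1] is u_t, u[0] is u_{-1}
    x = []
    for i in range(1, n + 1):
        lag = 0 if i % 2 == 1 else 1                   # odd: u_t, even: u_{t-1}
        x.append([u[t + 1 - lag] + rng.gauss(0, 1) for t in range(T)])
    return x, u[1:]

def chi_hat_odd(x, t):
    """Estimate of chi_it = u_t for odd i: odd x's at t, even x's at t + 1."""
    n = len(x)
    odd = sum(x[j][t] for j in range(0, n, 2))
    even = sum(x[j][t + 1] for j in range(1, n, 2))
    return (odd + even) / n

def mse(n, T=400):
    x, u = simulate_example1(n, T)
    errs = [(chi_hat_odd(x, t) - u[t]) ** 2 for t in range(T - 1)]
    return sum(errs) / len(errs)

for n in (10, 100, 400):
    print(n, round(mse(n), 4))   # roughly 1 / n
```

The estimator cannot be computed at the last date of the sample, because it needs the even variables one period ahead. This is the practical meaning of two-sidedness.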

In empirical applications, we first need an estimate of the number $q$ of dynamic factors. For consistent determination of $q$ as $T$ and $n$ tend to infinity, see Hallin and Liška.

The two-sided filters appearing in Example 2 are a general feature of the estimator of $\chi_{it}$ based on dynamic principal components. As a consequence, this estimator can be applied to reconstruct the history of the common components within the sample but is unfit for forecasting. However, an estimator of the spectral density of the $\chi$’s can be obtained without explicitly using the two-sided filters in the time domain. From (10),

$Display mathematics$
(13)

where $\tilde{\hat p}^x_n(\theta)$ is the conjugate transpose of $\hat p^x_n(\theta)$ and $\hat\lambda^x_n(\theta)$ is the $q\times q$ diagonal matrix with the eigenvalues $\hat\lambda^x_{nj}(\theta)$ on the main diagonal.

# The Finite-Dimension Assumption and the Static Method

We again use Example 1 to introduce the so-called static approach, which has been very popular in the literature on Dynamic Factor Models. First, equation (6) is transformed into a static representation by defining the factors $F_{1t}=u_t$, $F_{2t}=u_{t-1}$, and rewriting (6) as

$Display mathematics$

The factors $F_{jt}$, $j=1,2$, are usually called static factors, which is somewhat misleading because the vector $(F_{1t}\ F_{2t})'$ is not a white noise. For, $F_{2t}=F_{1,t-1}$, so that model (6), in which the white noise $u_t$ is dynamically loaded by the $x$’s, has been transformed into a model with static loading of factors which inherit the dynamics of (6). Precisely:

$Display mathematics$

The factors $F_{jt}$, $j=1,2$, can be estimated by means of the first two standard (static) principal components of the $x$’s. Using again, as in Example 2, the unfeasible population variance-covariance matrix of the $x$’s, call it $\Gamma^x_{n0}$, we have

$Display mathematics$

with the first two eigenvalues $\mu^x_{nj}$ and corresponding (static) eigenvectors $p^{S,x}_{nj}$ and static principal components $P^{S,x}_{nt}$:

$Display mathematics$

The static projection of $x_{it}$ on the first two static principal components is

$Display mathematics$

and

$Display mathematics$
(14)

Observation 2 As in Example 2, we are using population, not estimated covariances, our purpose here being only an illustration of the difference between static and dynamic principal components. We see that the first dynamic principal component maximizes the variance of a linear combination of the $x$’s, their lags and leads. The maximum is attained by taking the odd variables at time $t$ and shifting the even variables one period forward, see (11). With the static principal components, shifting is not allowed, so that to obtain the same asymptotic result (see (12) and (14)), we need two different principal components, the first selecting the odd, the second the even variables.
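The need for two static principal components can be verified on the population covariance matrix. The sketch assumes the form of Example 1 in which odd-indexed variables load $u_t$ and even-indexed ones load $u_{t-1}$, so that two variables share a common shock only when their indices have the same parity. Under that assumption the matrix has two diverging eigenvalues, $n/2+1$, with eigenvectors selecting the odd and the even variables, respectively.

```python
def covariance_example1(n):
    """Population covariance of (x_1t, ..., x_nt): same-parity pairs share one shock."""
    return [[(1.0 if (j - k) % 2 == 0 else 0.0) + (1.0 if j == k else 0.0)
             for k in range(n)] for j in range(n)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

n = 8
gamma = covariance_example1(n)
odd = [1.0 if j % 2 == 0 else 0.0 for j in range(n)]    # selects x_1, x_3, ...
even = [1.0 - o for o in odd]                            # selects x_2, x_4, ...
mu = n / 2 + 1                                           # first two eigenvalues

for v in (odd, even):
    gv = matvec(gamma, v)
    assert all(abs(g - mu * c) < 1e-12 for g, c in zip(gv, v))

# A contrast within one parity class has eigenvalue 1 (purely idiosyncratic).
w = [1.0, 0.0, -1.0] + [0.0] * (n - 3)
assert matvec(gamma, w) == w
print("two diverging eigenvalues:", mu, "remaining eigenvalues: 1")
```

A single dynamic factor therefore shows up as two static factors, because the static method cannot align $u_t$ and $u_{t-1}$ by shifting the even variables.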

Observation 3 As we have observed in section “Estimation of the General Model: Two-Sided Filters”, the two-sided estimate of the $χ$’s, see (10), is not reliable at the end of the sample, or for prediction. However, as an estimator within the sample it is very good; see the results based on simulated data in Forni et al. (2000).

In general, going back to (5), if the space $S_t$ spanned by $\chi_{it}$, $i\in\mathbb{N}$, has finite dimension $r$, then, denoting by $(F_{1t}\ F_{2t}\ \cdots\ F_{rt})$ a basis of $S_t$, we have

$Display mathematics$
(15)

The finite-dimension assumption, with the resulting static representation (15) and static factors $F_{jt}$, $j=1,2,\ldots,r$, has been extensively used in the literature on High-Dimensional Dynamic Factor Models. Estimation of the factors is carried out by means of the first $r$ static principal components of the $x$’s:

$Display mathematics$

The common components are estimated by projecting the $x$’s on the estimated factors:

$Display mathematics$
(16)

The static, or finite-dimensional, approach was introduced and studied in Stock and Watson (2002a, 2002b) and Bai and Ng (2002). For the determination of the number $r$ of static factors, see Bai and Ng (2002), Onatski (2010), Alessi, Barigozzi, and Capasso (2010), and Ahn and Horenstein (2013). As the projection (16) only uses current values of the factors, which in turn are obtained by means of linear combinations of current values of the $x$’s, the static approach produces one-sided estimators for the common components.

Moreover, predictors of the variables $x_{it}$ are obtained by projecting the variables $x_{i,t+h}$ on the estimated factors $\hat F_{jt}$, $j=1,2,\ldots,r$, plus possibly lags of the factors. For applications to U.S. and Euro Area macroeconomic datasets, see Stock and Watson (2002a), Boivin and Ng (2005), and Schumacher (2007).

However, the finite-dimension assumption rules out extremely simple cases. For example, consider

$$x_{it}=\frac{1}{1-\alpha_i L}u_t+\xi_{it},$$
(17)

where $\alpha_i$ is drawn randomly from the uniform distribution between $-a$ and $a$, with $0<a<1$. In this case

$$\chi_{it}=u_t+\alpha_i u_{t-1}+\alpha_i^2u_{t-2}+\cdots$$

and it is easy to see that the space $S_t$ spanned by $\chi_{it}$, $i\in\mathbb{N}$, is not finite-dimensional. For, if the dimension of $S_t$ were a finite integer $r$, for $n>r$ the $n\times\infty$ matrix

$Display mathematics$

should be of rank $r$. But the determinant of the $n×n$ matrix on the right (the Vandermonde determinant) is zero only if two of the $α$’s are equal, which is false with probability one.
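The rank argument can be checked with exact arithmetic. The sketch below assumes that the common components have moving-average coefficients $1,\alpha_i,\alpha_i^2,\ldots$, so that the first $n$ columns of the coefficient matrix form a Vandermonde matrix. It draws hypothetical rational values of $\alpha_i$ in $(-0.9,0.9)$ and computes the rank by Gaussian elimination over the rationals.

```python
import random
from fractions import Fraction

def vandermonde(alphas):
    """Rows (1, alpha_i, alpha_i^2, ...): first n MA coefficients of chi_it."""
    n = len(alphas)
    return [[a ** k for k in range(n)] for a in alphas]

def rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

rng = random.Random(1)
# Hypothetical draws: rational alphas strictly inside (-0.9, 0.9).
alphas = [Fraction(rng.randint(-899, 899), 1000) for _ in range(6)]
print(len(set(alphas)), rank(vandermonde(alphas)))    # distinct alphas -> full rank
```

The rank equals the number of distinct $\alpha$’s, so adding a variable with a new $\alpha_i$ always adds a dimension to the space spanned by the common components.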

Summing up, dynamic principal components can be applied to estimate the common components irrespective of whether the space $S_t$ spanned by $\chi_{it}$, $i\in\mathbb{N}$, is finite- or infinite-dimensional. Their disadvantage is that they produce two-sided filters. A solution to this problem, obtaining a one-sided estimator in the infinite-dimensional case, is presented in section “Estimation of the General Model: One-Sided Filters.”

# Frequency-Domain Approach in the Finite-Dimensional Case

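In section “Estimation of the General Model: Two-Sided Filters” we have obtained an estimate $\hat\Sigma^\chi_n(\theta)$ of the spectral density of the common components; see (13). Based on $\hat\Sigma^\chi_n(\theta)$, estimates of the autocovariance functions $\Gamma^\chi_{nk}=\mathrm{E}(\boldsymbol{\chi}_{nt}\boldsymbol{\chi}'_{n,t-k})$ and $\Gamma^\xi_{nk}=\mathrm{E}(\boldsymbol{\xi}_{nt}\boldsymbol{\xi}'_{n,t-k})$ can be obtained by a discrete version of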

$Display mathematics$
(18)

see (27) in section “Estimation of the General Model: One-Sided Filters.” Now assume that the space spanned by the common components $\chi_{it}$ is finite-dimensional. The information contained in $\hat\Gamma^\chi_{n0}$ and $\hat\Gamma^\xi_{n0}$ can be used to estimate the factor space more efficiently. For example, instead of the first eigenvector of $\hat\Gamma^x_{n0}$, which solves $\max_a[\operatorname{var}(a x_t)]=\max_a a\hat\Gamma^x_{n0}a'$, s.t. $aa'=1$, we can use the first generalized eigenvector of the couple $(\hat\Gamma^\chi_{n0},\hat\Gamma^\xi_{n0})$, which solves $\max_a[\operatorname{var}(a\hat\chi_t)]=\max_a a\hat\Gamma^\chi_{n0}a'$, s.t. $a\hat\Gamma^\xi_{n0}a'=1$, thus, roughly speaking, putting more weight on the variables with a big common component and penalizing those with a big idiosyncratic component.
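The reweighting performed by the generalized eigenvector can be seen in a small example. The sketch assumes a hypothetical one-factor structure, $\chi_{it}=b_iF_t$ with a unit-variance factor and mutually orthogonal idiosyncratic components of variance $d_i$, and uses population matrices in place of the estimates. In this rank-one case the first generalized eigenvector is proportional to $b_i/d_i$, which the code verifies against the generalized eigen-equation.

```python
import math

# Hypothetical one-factor example: chi_it = b_i F_t, Var(xi_it) = d_i, xi's orthogonal.
b = [1.0, 1.0, 1.0, 1.0]
d = [0.2, 0.2, 5.0, 5.0]          # variables 3 and 4 are mostly idiosyncratic
n = len(b)

gamma_chi = [[b[j] * b[k] for k in range(n)] for j in range(n)]
gamma_xi = [[d[j] if j == k else 0.0 for k in range(n)] for j in range(n)]

def rowvec_times(a, m):
    return [sum(a[j] * m[j][k] for j in range(n)) for k in range(n)]

# Candidate generalized eigenvector: a_i proportional to b_i / d_i,
# scaled so that a * Gamma_xi * a' = 1.
raw = [b[i] / d[i] for i in range(n)]
scale = math.sqrt(sum(raw[i] ** 2 * d[i] for i in range(n)))
a = [r / scale for r in raw]
nu = sum(b[i] ** 2 / d[i] for i in range(n))       # generalized eigenvalue

lhs = rowvec_times(a, gamma_chi)                   # a * Gamma_chi
rhs = [nu * x for x in rowvec_times(a, gamma_xi)]  # nu * a * Gamma_xi
gap = max(abs(l - r) for l, r in zip(lhs, rhs))

weights = [x / sum(a) for x in a]                  # relative weight on each variable
print(gap, [round(w, 3) for w in weights])
```

All four variables load the factor equally, yet almost all the weight goes to the two with small idiosyncratic variance. An ordinary eigenvector of the rank-one common covariance would weight them equally.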

The estimated factors are defined as follows. Let $\hat\nu_{n1},\hat\nu_{n2},\ldots,\hat\nu_{nr}$ be the first $r$ solutions, in decreasing order, of the generalized eigenvalue problem

$Display mathematics$

and $\hat p^{G,x}_{nk}$, $k=1,2,\ldots,r$, the corresponding generalized eigenvectors:

$Display mathematics$

Then:

$Display mathematics$

for $j=1,2,…,r$ and

$Display mathematics$

The estimator just outlined was introduced in Forni, Hallin, Lippi, and Reichlin (2005). Although it assumes a finite number of static factors, as in the standard static method, the dynamic structure of the dataset is exploited in the calculation of the covariance matrices $\hat\Gamma^\chi_k$ and $\hat\Gamma^\xi_k$. Predictions can be obtained in the same way as with the static approach, by projecting $x_{i,t+h}$ on factors at time $t$ and possibly lags of the factors. For applications to U.S. and Euro Area datasets, and comparisons between the frequency-domain method and the method based on standard principal components (see section “The Finite-Dimension Assumption and the Static Method”), see den Reijer (2005), Schumacher (2007), and D’Agostino and Giannone (2012). The frequency-domain approach, jointly with the finite-dimension assumption, is used in Altissimo, Cristadoro, Forni, and Lippi (2010) to construct an indicator of the Business Cycle for the Euro Area.

# Estimation of the General Model: One-Sided Filters

As we have seen in section “Estimation of the General Model: Two-Sided Filters”, the estimator of the common components $\chi_{it}$ based on dynamic principal components is two-sided. A one-sided estimator obtained by using the frequency-domain approach has been shown in section “Frequency-Domain Approach in the Finite-Dimensional Case”, but only for the finite-dimensional case.

One-sided estimators for the general model (5) have been obtained in Forni, Hallin, Lippi, and Zaffaroni (2015, 2017). Start with the second equation in (5):

$$\chi_{it}=a_{i1}(L)u_{1t}+a_{i2}(L)u_{2t}+\cdots+a_{iq}(L)u_{qt}.$$
(19)

When $n$ is large with respect to $q$, this is a moving average representation of a highly singular vector, with dimension $n$ and rank $q$.

Assumption 2 The filters $a_{ij}(L)$ in (19) are rational:

$Display mathematics$

(we suppose for simplicity that the degrees of $b_{ij}(L)$ and $c_{ij}(L)$ are independent of $i$ and $j$). Elaborating upon recent results by Anderson and Deistler (2008), Forni et al. (2015) prove the following.

Theorem 2 Under mild assumptions, which include Assumption 2, for generic values of the parameters $b_{ij,k}$ and $c_{ij,k}$ (i.e., apart from a lower-dimensional subset in the parameter space, see Forni et al. (2015) for details), supposing for simplicity that $n=m(q+1)$, the $n$-dimensional vector of common components $\chi_{nt}=(\chi_{1t}\ \chi_{2t}\ \cdots\ \chi_{nt})'$ admits an autoregressive representation with block structure of the form

$Display mathematics$
(20)

where $A_j(L)$ is a $(q+1)\times(q+1)$ polynomial matrix with finite degree, not exceeding $qs_1+q^2s_2$, and $R_j$ is $(q+1)\times q$. The matrices $A_j(L)$ are unique.

Example 3 To provide an intuition for the result in Theorem 2, that generically singular autoregressive–moving-average (ARMA) models possess a finite autoregressive representation, let us consider the following elementary example, in which $n=2$, $q=1$, and

$Display mathematics$
(21)

with parameters $(a_1,b_1,a_2,b_2)$ in $\mathbb{R}^4$. Outside of the lower-dimensional subset of $\mathbb{R}^4$ where $a_1b_2-a_2b_1=0$, we obtain

$Display mathematics$
(22)

Using (22) to get rid of $v_{t-1}$ in (21), we obtain the AR(1) representation

$Display mathematics$
(23)

where $d=1/(a_1b_2-a_2b_1)$. Note that if $a_1b_2-a_2b_1=0$, no finite-degree autoregressive representation exists, unless $b_1=b_2=0$. Quite obviously, $a_1b_2-a_2b_1\neq 0$ if and only if $z_{1,t-1}$ and $z_{2,t-1}$ are linearly independent. Therefore, generically, the projection (23) is unique, that is, no other autoregressive representation of degree one exists.
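The mechanism of Example 3 can be checked numerically. The sketch below assumes that (21) has the form $z_{it}=a_iv_t+b_iv_{t-1}$, $i=1,2$, which is what the surrounding text suggests but is not shown here, so it should be read as an assumption. Under that form, $v_{t-1}=d\,(b_2z_{1,t-1}-b_1z_{2,t-1})$, and substituting gives an exact AR(1). The function names and parameter values are hypothetical.

```python
import random

def ar1_matrix(a, b):
    """AR(1) coefficient matrix implied by z_it = a_i v_t + b_i v_{t-1} (assumed form)."""
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a1*b2 - a2*b1 = 0: non-generic case, no such AR(1) matrix")
    d = 1.0 / det
    # v_{t-1} = d * (b2 * z_{1,t-1} - b1 * z_{2,t-1}), substituted into each equation
    return [[d * b[0] * b[1], -d * b[0] ** 2],
            [d * b[1] ** 2, -d * b[0] * b[1]]]

def simulate(a, b, T, seed=0):
    """Simulate T observations of the singular MA(1) vector and the aligned shocks."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(T + 1)]
    z = [(a[0] * v[t] + b[0] * v[t - 1], a[1] * v[t] + b[1] * v[t - 1])
         for t in range(1, T + 1)]
    return v[1:], z          # v[1:][k] is the shock entering z[k]

a = (1.0, 0.5)               # illustrative parameter values
b = (0.4, -0.3)
A = ar1_matrix(a, b)
v, z = simulate(a, b, T=200)

# Largest deviation from z_t = A z_{t-1} + a v_t over the sample
max_residual = max(
    abs(z[t][i] - (A[i][0] * z[t - 1][0] + A[i][1] * z[t - 1][1] + a[i] * v[t]))
    for t in range(1, len(z)) for i in range(2)
)
print(max_residual)
```

The residual is zero up to rounding error, which illustrates that the two-dimensional vector driven by a single shock satisfies a finite autoregression even though each of its components is a moving average.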

Observation 4 The assumption that $q$ stays constant as $n$ grows implies that no curse of dimensionality arises with the VAR representation (20): the number of parameters in the block-diagonal matrix $A(L)$ grows at the speed of $n$ as $n\to\infty$, whereas the number of parameters for, say, an unrestricted VAR(1) for $x_{nt}$ would grow at the speed of $n^2$.
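The orders of magnitude in Observation 4 can be made concrete with a small count. The sketch below assumes, for illustration only, that every block $A_j(L)$ has the same number of lag matrices, called `degree` here; the function names and the values of $n$, $q$, and `degree` are hypothetical.

```python
def block_var_parameter_count(n, q, degree):
    """Autoregressive coefficients in a block-diagonal VAR with (q+1)-dimensional blocks."""
    if n % (q + 1) != 0:
        raise ValueError("n must be a multiple of q + 1")
    blocks = n // (q + 1)
    return blocks * degree * (q + 1) ** 2     # equals n * degree * (q + 1)

def unrestricted_var1_parameter_count(n):
    """Autoregressive coefficients in an unrestricted VAR(1) for n variables."""
    return n * n

for n in (120, 240, 480):
    print(n, block_var_parameter_count(n, q=3, degree=2),
          unrestricted_var1_parameter_count(n))
```

Doubling $n$ doubles the first count and quadruples the second, which is the linear versus quadratic growth stated in the observation.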

Applying the block-diagonal matrix $A_n(L)$ to both sides of $x_{nt}=\chi_{nt}+\xi_{nt}$ and using (20), we obtain $A_n(L)x_{nt}=R_nu_t+A_n(L)\xi_{nt}$, that is, setting $y_{nt}=A_n(L)x_{nt}$,

$Display mathematics$
(24)

which is a static factor model with static factors $u_t$. Thus, if we are able to construct an estimate of $A_n(L)$, estimates of $u_t$ and $R_n$ can be obtained by static principal components. Then

$Display mathematics$
(25)

The estimation procedure used in Forni et al. (2017) for the matrices $A_j(L)$ and $R_j$ and for the common components $\chi_{it}$ is outlined below. The whole procedure is used in Forni et al. (2017); its first steps already appear in Forni et al. (2000) and Forni et al. (2005).

(I) The spectral density of $x_t$ is estimated by means of a lag-window estimator

$Display mathematics$

where $\hat{\Gamma}^{x}_{k}$ is the estimated covariance between $x_t$ and $x_{t-k}$, $K$ is a kernel function, $B_T$ is the bandwidth parameter, and $2B_T+1$ is the size of the lag window. Liu and Wu (2010) and Wu and Zaffaroni (2017) show that, under assumptions specified in Forni et al. (2017), the estimated spectral density is consistent uniformly with respect to the frequency $\theta\in[-\pi,\pi]$. This property, which is crucial to prove consistency of $\hat{\chi}_{it}$, had been assumed in previous papers based on the frequency-domain approach.
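A lag-window estimator of the kind described in step (I) can be sketched as follows. The triangular (Bartlett) weights, the $1/T$ normalization of the sample autocovariances, the $1/(2\pi)$ scaling, and all function names are assumptions chosen for the illustration; the kernel and bandwidth used in Forni et al. (2017) may differ.

```python
import numpy as np

def sample_autocovariance(x, k):
    """Gamma_k = (1/T) * sum_t x_t x_{t-k}' for a T x n array x and lag k >= 0."""
    T = x.shape[0]
    xc = x - x.mean(axis=0)
    return xc[k:].T @ xc[:T - k] / T

def lag_window_spectral_density(x, bandwidth, thetas):
    """Estimate the n x n spectral density at each frequency in thetas."""
    n = x.shape[1]
    gammas = [sample_autocovariance(x, k) for k in range(bandwidth + 1)]
    estimates = []
    for theta in thetas:
        S = np.zeros((n, n), dtype=complex)
        for k in range(-bandwidth, bandwidth + 1):
            weight = 1.0 - abs(k) / (bandwidth + 1)        # triangular kernel (assumed)
            G = gammas[k] if k >= 0 else gammas[-k].T      # Gamma_{-k} = Gamma_k'
            S += weight * G * np.exp(-1j * k * theta)
        estimates.append(S / (2.0 * np.pi))
    return np.array(estimates)

# Usage on simulated white noise (illustrative data only)
rng = np.random.default_rng(0)
x = rng.standard_normal((400, 3))
thetas = np.linspace(-np.pi, np.pi, 9)
sigma_x = lag_window_spectral_density(x, bandwidth=10, thetas=thetas)
print(sigma_x.shape)
```

With these choices each estimated matrix is Hermitian and positive semidefinite, which is convenient for the eigen-decomposition used in the later steps.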

(II) Secondly, the criterion proposed in Hallin and Liška (2007) is applied to determine the number $q$ of common shocks.

(III) An estimate of the spectral density of $\chi_{nt}$ is obtained by equation (13), that is:

$Display mathematics$
(26)

The eigenvectors $\hat{p}^{x}_{n}(\theta)$ are used to construct the estimator of $\chi_{it}$ in Forni et al. (2000), see section “Estimation of the General Model: Two-Sided Filters.”

(IV) An estimate of the autocovariance function of $\chi_t$ is obtained as

$Display mathematics$
(27)

where $\theta_h=\pi h/B_T$. This is the discrete version of (18). Under the finite-dimension assumption, the autocovariance function (27) is used to construct the estimator of $\chi_{it}$ in Forni et al. (2005), see section “Frequency-Domain Approach in the Finite-Dimensional Case.”
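Steps (III) and (IV) can be sketched together: keep the $q$ largest dynamic eigenvalues and their eigenvectors at each frequency, then invert the Fourier transform numerically. The grid $\theta_h=\pi h/B_T$ follows the text, but the range $h=-B_T+1,\dots,B_T$, the Riemann-sum weight $\pi/B_T$, and the function names are assumptions made for this illustration.

```python
import numpy as np

def common_spectral_density(sigma_x, q):
    """Rank-q part of each Hermitian matrix in sigma_x (array of shape H x n x n)."""
    out = np.empty_like(sigma_x)
    for h, S in enumerate(sigma_x):
        lam, P = np.linalg.eigh(S)             # ascending eigenvalues
        lam_q, P_q = lam[-q:], P[:, -q:]       # q largest and their eigenvectors
        out[h] = (P_q * lam_q) @ P_q.conj().T  # sum_j lam_j p_j p_j^*
    return out

def autocovariance_from_spectrum(sigma, thetas, k):
    """Riemann-sum approximation of the integral of sigma(theta) e^{i k theta}."""
    step = thetas[1] - thetas[0]
    total = np.zeros(sigma[0].shape, dtype=complex)
    for S, theta in zip(sigma, thetas):
        total += S * np.exp(1j * k * theta)
    return (total * step).real                 # imaginary part is rounding noise for real data

# Usage: a flat spectrum I/(2*pi) corresponds to white noise with identity covariance.
B = 8
thetas = np.pi * np.arange(-B + 1, B + 1) / B
flat = np.array([np.eye(2) / (2.0 * np.pi) for _ in thetas], dtype=complex)
gamma_0 = autocovariance_from_spectrum(flat, thetas, 0)
gamma_1 = autocovariance_from_spectrum(flat, thetas, 1)
print(gamma_0, gamma_1)
```

In practice the input to `common_spectral_density` would be the estimated spectral density of the observable variables evaluated on the same frequency grid.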

(V) Using the estimated autocovariance function, $\hat{\Gamma}^{\chi}_{nk}$, the standard VAR calculation of the coefficients of the polynomial matrices

$Display mathematics$

can be implemented for any given $s$. To fix ideas let $j=1$, that is, consider the first $(q+1)$-dimensional block and the matrix polynomial

$Display mathematics$
(28)

Define

$Display mathematics$

where $\hat{\Sigma}^{\chi}_{11}(\theta)$ is the estimated spectral density of the block $\chi^{1}_{t}=(\chi_{1t}\ \chi_{2t}\ \cdots\ \chi_{q+1,t})'$, that is, the upper-left $(q+1)\times(q+1)$ submatrix of the spectral density (26). Now define:

$Display mathematics$

and

$Display mathematics$

Then:

$Display mathematics$

and therefore (28).
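The "standard VAR calculation" of step (V) amounts to solving Yule-Walker equations for one block. The sketch below assumes the convention $\Gamma_k=E[\chi_t\chi_{t-k}']$ and writes the autoregression as $\chi_t=A_1\chi_{t-1}+\dots+A_s\chi_{t-s}+e_t$; the sign convention of the polynomial in (28) may differ, and the function name and test values are hypothetical.

```python
import numpy as np

def yule_walker_var(gammas, s):
    """Solve [A_1 ... A_s] C = [Gamma_1 ... Gamma_s] for the VAR(s) coefficients.

    gammas[k] is the m x m autocovariance Gamma_k for k = 0, ..., s.
    """
    m = gammas[0].shape[0]

    def gamma(k):
        return gammas[k] if k >= 0 else gammas[-k].T   # Gamma_{-k} = Gamma_k'

    # Block (j, k) of C is Gamma_{k-j}, for j, k = 1, ..., s
    C = np.block([[gamma(k - j) for k in range(1, s + 1)] for j in range(1, s + 1)])
    B = np.hstack([gammas[k] for k in range(1, s + 1)])
    A = np.linalg.solve(C.T, B.T).T                    # solves A @ C = B
    return [A[:, i * m:(i + 1) * m] for i in range(s)]

# Usage: autocovariances of a known VAR(1), fitted with s = 2 lags.
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
gamma_0 = np.eye(2)
for _ in range(200):                                   # Gamma_0 = A Gamma_0 A' + I
    gamma_0 = A_true @ gamma_0 @ A_true.T + np.eye(2)
gammas = [gamma_0, A_true @ gamma_0, A_true @ A_true @ gamma_0]
A_1, A_2 = yule_walker_var(gammas, s=2)
print(A_1, A_2)
```

In the procedure described above, the inputs would be the $(q+1)\times(q+1)$ autocovariances of a block of estimated common components rather than population values.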

(VI) The estimates $\hat{u}_t$ and $\hat{R}_k$ can be obtained by standard principal components of $y_{nt}=\hat{A}_n(L)x_{nt}$, see (24).
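Step (VI) is an ordinary static principal-component calculation applied to the filtered data. The sketch below normalizes the estimated factors to unit sample variance, which is one common convention and an assumption here; factors and loadings are identified only up to an invertible linear transformation. The function name and the simulated data are hypothetical.

```python
import numpy as np

def static_principal_components(y, q):
    """Return (u_hat, R_hat): T x q factors with unit sample variance and n x q loadings."""
    T = y.shape[0]
    yc = y - y.mean(axis=0)
    cov = yc.T @ yc / T
    lam, P = np.linalg.eigh(cov)               # ascending eigenvalues
    lam_q = lam[::-1][:q]                      # q largest eigenvalues
    P_q = P[:, ::-1][:, :q]                    # matching eigenvectors
    u_hat = yc @ P_q / np.sqrt(lam_q)          # normalized principal components
    R_hat = P_q * np.sqrt(lam_q)               # loadings, so that fit = u_hat @ R_hat.T
    return u_hat, R_hat

# Usage on simulated data with q = 2 factors (illustrative only)
rng = np.random.default_rng(1)
T, n, q = 500, 6, 2
u = rng.standard_normal((T, q))
R = rng.standard_normal((n, q))
y = u @ R.T + 0.1 * rng.standard_normal((T, n))
u_hat, R_hat = static_principal_components(y, q)
common_fit = u_hat @ R_hat.T
print(u_hat.shape, R_hat.shape)
```

The product `u_hat @ R_hat.T` is the projection of the demeaned data on the first `q` principal directions, so it does not depend on the normalization chosen for the factors.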

(VII) The covariance matrix of $\hat{R}\hat{u}_t$ can be used to determine the maximum lag $s$ by means of an information criterion.

(VIII) The resulting estimator and $h$-step ahead predictor for the common components are

$Display mathematics$

see (25), and

$Display mathematics$

respectively. Estimators for the idiosyncratic components are $\hat{\xi}_{nt}=x_{nt}-\hat{\chi}_{nt}$. Univariate models for each of the estimated idiosyncratic components can be used to obtain predictions $\hat{\xi}_{n,t+h|t}$ and therefore $\hat{x}_{n,t+h|t}=\hat{\chi}_{n,t+h|t}+\hat{\xi}_{n,t+h|t}$.

# The Static and Dynamic Methods Compared: Results Based on Simulated and Empirical Data

In section “Estimation of the General Model: Two-Sided Filters” we have argued that the finite-dimension assumption, underlying the static method, rules out elementary models. However, we must also point out that, even if the finite-dimension assumption does not hold for the data generating process, with a given dataset (thus given $n$ and $T$) the static method might provide a good approximation and, possibly, outperform the dynamic method both in estimation and forecasting.

From this point of view, the static and the dynamic methods can only be compared by their performance with simulated or empirical data. Forni et al. (2017) produce simulated data from both an infinite- and a finite-dimensional generating process, Models I and II respectively, and compare the results obtained with the static and dynamic methods in estimating impulse-response functions and forecasting. When the data are generated by Model I (infinite dimensional), the dynamic method strongly outperforms the static method both for the impulse-response function and forecasting. Unexpectedly, the dynamic method remains the best, though by a small margin, even when the data are generated by Model II (finite dimensional). This suggests that the autoregressive specification used in the dynamic method, see (20), as opposed to increasing the number of static factors to account for the dynamics in the loadings (see the illustration at the beginning of section “The Finite-Dimension Assumption and the Static Method”), is an advantage for the dynamic method irrespective of whether the underlying data generating process is finite or infinite dimensional.

In a recent paper, Forni et al. (in press) conduct a pseudo real-time forecasting exercise using a large monthly dataset of macroeconomic and financial time series for the U.S. economy, which includes the Great Moderation, the Great Recession, and the subsequent recovery (an update of the so-called Stock and Watson dataset). The target variables are Industrial Production and Inflation and the methods compared are: (i) the standard principal-component model of Stock and Watson (2002a, 2002b; see section “The Finite-Dimension Assumption and the Static Method”), (ii) the model based on generalized principal components (Forni et al., 2005; see section “Frequency-Domain Approach in the Finite-Dimensional Case”), (iii) the model outlined in section “Estimation of the General Model: One-Sided Filters” (Forni et al., 2017), and (iv) a univariate autoregressive (AR) model. Using a ten-year rolling window for estimation and prediction, the paper finds that model (iii) significantly outperforms (i), (ii), and (iv) in the Great Moderation period for both Industrial Production and Inflation, and, though by a small margin, for Inflation over the full sample. However, (iii) is outperformed by (ii) and (i) over the full sample for Industrial Production. Thus, when the assumptions of stationarity and co-stationarity (after suitable transformations) underlying the factor models are (approximately) fulfilled by the variables in the dataset, the dynamic method outperforms the static methods, confirming the results obtained with simulated data. However, the results in the crisis period seem to show that (iii) is more sensitive than the other factor models to instability in the covariance structure of the dataset.

Three alternative methods are also considered: (a) a Bayesian approach to forecasting with large datasets (De Mol, Giannone, & Reichlin, 2008), (b) the so-called Three-Pass Regression Filter (Kelly & Pruitt, 2015), and (c) a method based on Maximum-Likelihood estimation of the factors (Doz, Giannone, & Reichlin, 2011). In the Great Moderation period they are outperformed by the dynamic method (iii) for both Industrial Production and Inflation, while in the full sample method (a) slightly outperforms (iii) for Inflation.

# Conclusions and Open Problems for Future Research

The frequency-domain approach to High-Dimensional Dynamic Factor Models is based on the estimation of the spectral density of the observable variables $x_{it}$ and their dynamic principal components. Early work using this approach has obtained estimators of the common components $\chi_{it}$ that use two-sided filters (Forni et al., 2000). Though accurate within the sample, these estimators are unfit for forecasting. As an alternative, one-sided filters were obtained under the finite-dimension assumption (Forni et al., 2005). Lastly, recent papers have solved the problem in the general case, that is, without assuming finite dimension of the space spanned by the $\chi$’s (Forni et al., 2015, 2017).

Based on Forni et al. (2015, 2017) and Forni et al. (in press), promising results have been obtained in forecasting with the U.S. monthly macroeconomic dataset, both in comparison with the standard AR model and with other factor models. However, the forecasting performances of the factor models change during the Great Recession, relative to one another and to the AR method. This is not surprising, as the theoretical results for factor models heavily depend on the assumption of stationarity and co-stationarity of the variables in the dataset. An improvement in the performance of the frequency-domain methods might come from relaxation of the stationarity assumption as in locally stationary time-series models (Dahlhaus, 2009; Motta, Hafner, & Von Sachs, 2011).

Instability, breaks, and estimation of factors and loadings when the assumption of stationarity, either global or local, is not fulfilled have been studied with time-domain methods for the standard dynamic factor model (see Stock & Watson, 2002a; D’Agostino, Giannone, & Surico, 2007; Banerjee, Marcellino, & Masten, 2008; Stock & Watson, 2009; D’Agostino, Gambetti, & Giannone, 2013; Clements, 2016; and Forni et al., in press). Again, an improvement for predictors and Structural Factor Analysis based on frequency-domain methods might be the result of explicitly taking instability and breaks into account.

As mentioned in section “Frequency-Domain Approach in the Finite-Dimensional Case”, the frequency-domain approach, jointly with the finite-dimension assumption, has been employed to construct an indicator of the Euro Area Business Cycle (Altissimo et al., 2010). Even in this case, the results in Forni et al. (2015, 2017), allowing the relaxation of the finite-dimension assumption, should provide an improvement in the performance of the indicator.

Stock and Watson (2005) and Forni et al. (2009) show that identification of Impulse-Response Functions of the observable variables $x_{it}$ with respect to the structural shocks, based on economic logic, can be obtained in the same way as in Structural VAR analysis (see section “Structural Factor Analysis”). On the other hand, simulation experiments in Forni et al. (2017) show that the frequency-domain method outperforms the static method in the estimation of the Impulse-Response Functions (both when the data are generated with an infinite-dimensional model, like (17), and with a finite-dimensional model). Thus, even with structural analysis, the new methods in Forni et al. (2015, 2017) seem very promising.

Lastly, consistency of a Maximum-Likelihood estimator has been obtained in Geweke and Singleton (1981) for the fixed-$n$ factor model proposed in Sargent and Sims (1977), and for a High-Dimensional Dynamic Factor Model, under the finite-dimension assumption (Doz et al., 2011). Extension to the general High-Dimensional Dynamic Factor Model, though highly desirable, has not yet been studied.

## References

Ahn, S. C., & Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3), 1203–1227.

Alessi, L., Barigozzi, M., & Capasso, M. (2010). Improved penalization for determining the number of factors in approximate factor models. Statistics & Probability Letters, 80(23–24), 1806–1813.

Altissimo, F., Cristadoro, R., Forni, M., Lippi, M., & Veronese, G. (2010). New Eurocoin: Tracking economic growth in real time. The Review of Economics and Statistics, 92(4), 1024–1034.

Amengual, D., & Watson, M. W. (2007). Consistent estimation of the number of dynamic factors in a large N and T panel. Journal of Business & Economic Statistics, 25(1), 91–96.

Anderson, B., & Deistler, M. (2008). Properties of zero-free transfer function matrices. SICE Journal of Control, Measurement and System Integration, 1(4), 284–292.

Bai, J., & Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1), 191–221.

Banerjee, A., Marcellino, M., & Masten, I. (2008). Forecasting macroeconomic variables using Diffusion Indexes in short samples with structural change. In D. E. Rapach & M. E. Wohar (Eds.), Forecasting in the presence of structural breaks and model uncertainty (pp. 149–194). Bingley, UK: Emerald Group Publishing.

Boivin, J., & Ng, S. (2005). Understanding and comparing factor-based forecasts. International Journal of Central Banking, 1(3), 117–151.

Brillinger, D. R. (1981). Time series: Data analysis and theory. San Francisco, CA: Holden-Day, Inc.

Burns, A. F., & Mitchell, W. C. (1946). Measuring business cycles. Cambridge, MA: NBER.

Chamberlain, G. (1983). Funds, factors and diversification in arbitrage pricing models. Econometrica, 51(5), 1281–1304.

Chamberlain, G., & Rothschild, M. (1983). Arbitrage, factor structure, and mean-variance analysis on large asset markets. Econometrica, 51(5), 1305–1324.

Clements, M. P. (2016). Real-time factor model forecasting and the effects of instability. Computational Statistics & Data Analysis, 100(C), 661–675.

D’Agostino, A., Gambetti, L., & Giannone, D. (2013). Macroeconomic forecasting and structural change. Journal of Applied Econometrics, 28(1), 82–101.

D’Agostino, A., & Giannone, D. (2012). Comparing alternative predictors based on large-panel factor models. Oxford Bulletin of Economics and Statistics, 74(2), 306–326.

D’Agostino, A., Giannone, D., & Surico, P. (2007). (Un)Predictability and macroeconomic stability (CEPR Discussion Paper No. 6594). London, UK: CEPR.

Dahlhaus, R. (2009). Local inference for locally stationary time series based on the empirical spectral measure. Journal of Econometrics, 151(2), 101–112.

De Mol, C., Giannone, D., & Reichlin, L. (2008). Forecasting using a large number of predictors: Is Bayesian shrinkage a valid alternative to principal components? Journal of Econometrics, 146(2), 318–328.

Doz, C., Giannone, D., & Reichlin, L. (2011). A quasi–maximum likelihood approach for large, approximate dynamic factor models. Review of Economics and Statistics, 94(4), 1014–1024.

Forni, M., Giannone, D., Lippi, M., & Reichlin, L. (2009). Opening the black box: Structural factor models with large cross sections. Econometric Theory, 25(5), 1319–1347.

Forni, M., Giovannelli, A., Lippi, M., & Soccorsi, S. (in press). Dynamic factor model with infinite dimensional factor space: Forecasting. Journal of Applied Econometrics.

Forni, M., Hallin, M., Lippi, M., & Reichlin, L. (2000). The generalized dynamic-factor model: Identification and estimation. Review of Economics and Statistics, 82(4), 540–554.

Forni, M., Hallin, M., Lippi, M., & Reichlin, L. (2005). The generalized dynamic factor model: One-sided estimation and forecasting. Journal of the American Statistical Association, 100, 830–840.

Forni, M., Hallin, M., Lippi, M., & Zaffaroni, P. (2015). Dynamic factor models with infinite-dimensional factor spaces: One-sided representations. Journal of Econometrics, 185(2), 359–371.

Forni, M., Hallin, M., Lippi, M., & Zaffaroni, P. (2017). Dynamic factor models with infinite-dimensional factor spaces: Asymptotic analysis. Journal of Econometrics, 199(1), 72–92.

Forni, M., & Lippi, M. (1997). Aggregation and the microfoundations of dynamic macroeconomics. Oxford, UK: Oxford University Press.

Forni, M., & Lippi, M. (2001). The generalized dynamic factor model: Representation theory. Econometric Theory, 17(6), 1113–1141.

Forni, M., & Reichlin, L. (1996). Dynamic common factors in large cross-sections. Empirical Economics, 21(1), 27–42.

Forni, M., & Reichlin, L. (1998). Let’s get real: A factor analytical approach to disaggregated business cycle dynamics. Review of Economic Studies, 65(3), 453–473.

Geweke, J. F., & Singleton, K. J. (1981). Maximum likelihood “confirmatory” factor analysis of economic time series. International Economic Review, 22(1), 37–54.

Hallin, M., & Lippi, M. (2013). Factor models in high-dimensional time series–A time-domain approach. Stochastic Processes and their Applications, 123(7), 2678–2695.

Hallin, M., & Liška, R. (2007). Determining the number of factors in the general dynamic factor model. Journal of the American Statistical Association, 102(478), 603–617.

Kelly, B., & Pruitt, S. (2015). The three-pass regression filter: A new approach to forecasting using many predictors. Journal of Econometrics, 186(2), 294–316.

Liu, W., & Wu, W. B. (2010). Asymptotics of spectral density estimates. Econometric Theory, 26(4), 1218–1245.

Motta, G., Hafner, C., & Von Sachs, R. (2011). Locally stationary factor models, Identification and nonparametric estimation. Econometric Theory, 27(6), 1279–1319.

Onatski, A. (2009). Testing hypotheses about the number of factors in large factor models. Econometrica, 77(5), 1447–1479.

Onatski, A. (2010). Determining the number of factors from empirical distribution of eigenvalues. The Review of Economics and Statistics, 92(4), 1004–1016.

Quah, D., & Sargent, T. J. (1993). A dynamic index model for large cross sections. In J. H. Stock & M. W. Watson (Eds.), Business cycles, indicators and forecasting (pp. 285–310). Cambridge, MA: NBER.

den Reijer, A. H. J. (2005). Forecasting Dutch GDP using large scale factor models (DNB Working Paper No. 2005/28). Amsterdam, the Netherlands: De Nederlandsche Bank.

Sargent, T. J., & Sims, C. A. (1977). Business cycle modeling without pretending to have too much a priori economic theory. In C. A. Sims (Ed.), New methods in business cycle research (pp. 145–168). Minneapolis, MN: Federal Reserve Bank of Minneapolis.

Schumacher, C. (2007). Forecasting German GDP using alternative factor models based on large datasets. Journal of Forecasting, 26(4), 271–302.

Stock, J., & Watson, M. (2009). Forecasting in dynamic factor models subject to structural instability. In J. L. Castle & N. Shephard (Eds.), The methodology and practice of econometrics: A festschrift in honour of David F. Hendry (pp. 173–205). Oxford, UK: Oxford University Press.

Stock, J. H., & Watson, M. W. (1998). Diffusion indexes (NBER Working Paper No. 6702). Cambridge, MA: NBER.

Stock, J. H., & Watson, M. W. (2002a). Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association, 97(460), 1167–1179.

Stock, J. H., & Watson, M. W. (2002b). Macroeconomic forecasting using diffusion indexes. Journal of Business & Economic Statistics, 20(2), 147–162.

Stock, J. H., & Watson, M. W. (2005). Implications of dynamic factor models for VAR analysis (NBER Working Paper No. 11467). Cambridge, MA: NBER.

Wu, W. B., & Zaffaroni, P. (2017). Asymptotic theory for spectral density estimates of general multivariate time series. Econometric Theory.