date: 30 October 2020

# Time-Domain Approach in High-Dimensional Dynamic Factor Models

• Marco Lippi, Econometrics, Einaudi Institute for Economics and Finance

### Summary

High-dimensional dynamic factor models have their origin in macroeconomics, more specifically in empirical research on business cycles. The central idea, going back to the work of Burns and Mitchell in the 1940s, is that the fluctuations of all the macro and sectoral variables in the economy are driven by a “reference cycle,” that is, a one-dimensional latent cause of variation. After a fairly long process of generalization and formalization, the literature settled at the beginning of the 2000s on a model in which (a) both n, the number of variables in the data set, and T, the number of observations for each variable, may be large; (b) all the variables in the data set depend dynamically on a fixed, independent of n, number of common shocks, plus variable-specific, usually called idiosyncratic, components. The structure of the model can be exemplified as follows:

(*) $x_{it}=\alpha_iu_t+\beta_iu_{t-1}+\xi_{it},$

where the observable variables $x_{it}$ are driven by the white noise $u_t$, the common shock, which is common to all the variables, and by the idiosyncratic component $\xi_{it}$. The common shock $u_t$ is orthogonal to the idiosyncratic components $\xi_{it}$, and the idiosyncratic components are mutually orthogonal (or weakly correlated). Last, the variations of the common shock $u_t$ affect the variable $x_{it}$ dynamically, that is, through the lag polynomial $\alpha_i+\beta_iL$. Asymptotic results for high-dimensional factor models, consistency of estimators of the common shocks in particular, are obtained for both $n$ and $T$ tending to infinity.

The time-domain approach to these factor models is based on the transformation of dynamic equations into static representations. For example, equation (*) becomes

$x_{it}=\alpha_iF_{1t}+\beta_iF_{2t}+\xi_{it}.$

Instead of the dynamic equation (*) there is now a static equation, while instead of the white noise $u_t$ there are now two factors, also called static factors, which are dynamically linked:

$F_{1t}=u_t,\qquad F_{2t}=F_{1,t-1}=u_{t-1}.$

This transformation into a static representation, whose general form is

$x_{it}=\lambda_{i1}F_{1t}+\lambda_{i2}F_{2t}+\cdots+\lambda_{ir}F_{rt}+\xi_{it},$

is extremely convenient for estimation and forecasting of high-dimensional dynamic factor models. In particular, the factors $F_{jt}$ and the loadings $\lambda_{ij}$ can be consistently estimated from the principal components of the observable variables $x_{it}$.

Assumptions allowing consistent estimation of the factors and loadings are discussed in detail. Moreover, it is argued that in general the vector of the factors is singular; that is, it is driven by a number of shocks smaller than its dimension. This fact has very important consequences. In particular, singularity implies that the fundamentalness problem, which is hard to solve in structural vector autoregressive (VAR) analysis of macroeconomic aggregates, disappears when the latter are studied as part of a high-dimensional dynamic factor model.

### Introduction

A prerequisite for this article is a basic knowledge of the time-domain analysis of weakly stationary stochastic processes, including fundamental and nonfundamental representations, and of principal component analysis. All the definitions and results are illustrated by elementary examples in which the number of factors is one or two and the dynamics are limited to moving averages of order one.

The present article is closely linked to Lippi (2018), “Frequency-Domain Approach to High-Dimensional Dynamic Factor Models.” The latter, in the section “Early Literature on Dynamic Factor Models,” contains a short review of the literature on dynamic factor models from the 1940s, with the seminal contribution in Burns and Mitchell (1946), to the 1970s, with Sargent and Sims (1977) and related papers, to the introduction of infinite cross-sections, first in Chamberlain and Rothschild (1983) and Chamberlain (1983) and then in Forni and Reichlin (1996), Forni and Lippi (1997), Stock and Watson (1998), and Forni and Reichlin (1998).

Another important observation regarding the relationship between this article and Lippi (2018) is that “frequency domain” and “time domain” have become customary in the literature on dynamic factor models in reference to the methods outlined in Lippi (2018) and in this article, respectively. However, there is no conceptual difference between the models and the role of the spectral methods is purely technical, as shown in Hallin and Lippi (2013). Much more important is the fact that some simple models cannot be put in the static representation (see later discussion) and therefore cannot be studied with this article’s techniques.

“General Assumptions and Representation Theorem” gives the basic definition of the high-dimensional dynamic factor model and states the main representation theorem. A static equation links the $n$ observable variables $x_{it}$ to the $r$ factors $F_{jt}$ and the idiosyncratic components $\xi_{it}$. The representation theorem proves that the number of factors is equal to the number of eigenvalues of the variance-covariance matrix of the x’s that diverge as $n$ tends to infinity. Thus the unobserved structure of the model is “revealed” by properties of the observable variables. The model and the theorem are illustrated by means of an elementary example.

“Dynamics” shows that a rich variety of dynamic models, driven by a q-dimensional white-noise vector of common shocks $u_t$, can be accommodated in the static representation of the previous section. This is done by replacing lags of $u_t$ in the primitive form with factors $F_{jt}$ in the static representation. A consequence is that in general the vector $F_t=(F_{1t}\cdots F_{rt})'$ is singular; that is, its dimension is larger than the dimension of its driving white noise $u_t$. A theorem on singular autoregressive–moving-average vector processes ensures that $F_t$ has a finite-degree vector autoregressive (VAR) representation. The same theorem implies that $u_t$ is fundamental for $F_t$.

“Estimation” provides motivation for the use of principal components to construct estimators of the factor space. The estimation theory for the number of static factors r, for the number of dynamic factors q, and for the factors themselves is reviewed. “Forecasting” presents the forecasting equation implicit in the structure of the dynamic factor model.

“The Factor Model and Structural Macroeconomic Analysis” shows that the high-dimensional dynamic factor models can be used for structural macroeconomic analysis. The common and idiosyncratic components of variables like the gross domestic product (GDP), industrial production, inflation, and so on are interpreted as the relevant macroeconomic indicators and measurement errors, respectively. The usual identification techniques of the structural shocks can be applied with no modification. Thus, unlike in structural VARs (SVAR), no fundamentalness problem arises.

Usually, the results on estimation and forecasting of factor models are applied to the data set obtained by transformations by which stationarity is achieved, differencing in particular. However, “Non-Stationary Processes” briefly reviews papers in which the non-stationarity of the data set is directly taken into account. “Conclusions and Open Problems for Future Research” concludes and outlines some ideas for future developments.

As a rule, expressions like $x_{it}$ or $\mu_{nj}^x$ are written without a comma between $i$ and $t$ or $n$ and $j$. However, to avoid possible confusion, this article uses $x_{i,t+1}$, $x_{i+1,t}$, $\mu_{n,j+1}^x$, and so on. The end of “Examples” and “Observations” is marked with the symbol □.

### General Assumptions and Representation Theorem

The data sets considered in the literature on high-dimensional dynamic factor models are of the form

$X=\{x_{it},\ i=1,\ldots,n,\ t=1,\ldots,T\},$(1)

where $n$, the size of the cross-section, is comparable to, or even larger than, $T$. The corresponding theoretical assumption is that $X$ is a finite realization of a double-indexed stochastic process

$\{x_{it},\ i\in\mathbb{N},\ t\in\mathbb{Z}\}.$(2)

It is assumed that the vector process $\{(x_{i_1t}\cdots x_{i_mt})',\ t\in\mathbb{Z}\}$ is weakly stationary for all m-tuples $i_1,\ldots,i_m$, which implies that all the processes $x_{it}$ are weakly stationary. Without loss of generality, it is assumed that all the processes $x_{it}$ are zero-mean.

In empirical applications to macroeconomic data sets, most of the variables are nonstationary, I(1) in particular. Usually, the results on estimation and forecasting of factor models are applied to the data set obtained by transformations by which stationarity is achieved, differencing in particular. However, the section “Non-Stationary Processes” briefly reviews papers in which the non-stationarity of the data set is directly taken into account.

A particular form of the factor model illustrated in this article is the following. Suppose that the observed processes $x_{it}$ can be represented as follows:

$x_{it}=\chi_{it}+\xi_{it},\qquad\chi_{it}=\lambda_{i1}F_{1t}+\lambda_{i2}F_{2t}+\cdots+\lambda_{ir}F_{rt},$(3)

where:

(a) The vector process $F_t=(F_{1t}\cdots F_{rt})'$ is weakly stationary and zero-mean;

(b) $\xi_{i_1t}\perp\xi_{i_2t}$ for all $i_1\neq i_2$ and all $t\in\mathbb{Z}$;

(c) $\xi_{it}\perp F_t$.

The unobserved processes $\chi_{it}$ and $\xi_{it}$ are called the common components and the idiosyncratic components of the processes $x_{it}$, respectively. The unobserved processes $F_{jt}$, $j=1,\ldots,r$, are called the common factors. The unobserved coefficients $\lambda_{ij}$ are called the loadings of the factors. Note that under (b) and (c) the covariance among the x’s is completely accounted for by the covariance among the common components.

Note also that in factor models in which $n$ is a given, finite number, assumption (b) is necessary for identification. As seen later in Theorem 1, a weaker form of (b) is sufficient when $n$ is allowed to tend to infinity.

The most elementary specification of model (3) is the following.

#### Example 1

Let $r=1$, that is, only one common factor, and

$x_{it}=\lambda F_t+\xi_{it},$(4)

where $\lambda\neq0$. For the covariance among the x’s:

$\mathrm{E}(x_{it}x_{jt})=\lambda^2\,\mathrm{E}(F_t^2),\qquad i\neq j.$

By taking the arithmetic mean of the first $n$ processes: $(1/n)\sum_{i=1}^nx_{it}=\lambda F_t+(1/n)\sum_{i=1}^n\xi_{it}$, that is

$\bar{x}_{nt}=\lambda F_t+\bar{\xi}_{nt}.$

Assuming that the variances of the idiosyncratic components are bounded, $\sigma_{\xi_i}^2\le\sigma_\xi^2$, $i\in\mathbb{N}$,

$\mathrm{var}(\bar{\xi}_{nt})=\frac{1}{n^2}\sum_{i=1}^n\sigma_{\xi_i}^2\le\frac{\sigma_\xi^2}{n}\to0\quad\text{as }n\to\infty.$

Thus, in mean square,

$\lim_{n\to\infty}\bar{x}_{nt}=\lambda F_t.$(5)

This example is simple, but it conveys the idea leading to estimates of the unobserved components. Here $n$, the size of the cross-section, and $T$, the number of observations for each time series, both tend to infinity. A consequence of $n$ tending to infinity is that the idiosyncratic components are averaged away and the common components are estimated consistently, which is not the case if the model only contains a finite number of processes $x_{it}$. □
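The averaging argument of Example 1 can be checked numerically. The following is a minimal simulation sketch, not part of the original exposition: it assumes Gaussian draws, a unit-variance factor, and the illustrative values $\lambda=1.5$ and $\sigma_\xi=2$; the function name `average_mse` is hypothetical. Under these assumptions the mean-square distance between the cross-sectional average and $\lambda F_t$ should be close to $\sigma_\xi^2/n$.

```python
import numpy as np

def average_mse(n, T=2000, lam=1.5, sigma_xi=2.0, seed=0):
    """Mean squared distance between the cross-sectional average of the x's and lam * F_t."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal(T)                    # common factor, unit variance (assumption)
    xi = sigma_xi * rng.standard_normal((n, T))   # mutually orthogonal idiosyncratic components
    x = lam * F + xi                              # model (4); F is broadcast over i
    x_bar = x.mean(axis=0)                        # arithmetic mean of the first n processes
    return float(np.mean((x_bar - lam * F) ** 2))

mse_small = average_mse(n=10)      # theory: sigma_xi^2 / n = 0.4
mse_large = average_mse(n=1000)    # theory: sigma_xi^2 / n = 0.004
print(mse_small, mse_large)
```

The simulated distances shrink roughly in proportion to $1/n$, which is the population statement behind equation (5).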

#### Observation 1

The discussion in Example 1, the limit in equation (5) in particular, involves population covariances and mean-square convergence. Examples based on population covariances are used in this article only for heuristic purposes; see also the introduction of principal components in the “Estimation” section, for example.□

The result presented here as Theorem 1 states that the processes $x_{it}$ have a representation like (3), with a slightly different definition of the idiosyncratic components, if and only if their covariances satisfy the condition specified in Assumption 1. Thus the existence of a structure like (3) can be established by a criterion implying only properties of the observable processes $x_{it}$.

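Define $\mathbf{x}_{nt}=(x_{1t}\ x_{2t}\cdots x_{nt})'$. $\Gamma_{nk}^x$ is the covariance between $\mathbf{x}_{nt}$ and $\mathbf{x}_{n,t-k}$:

$\Gamma_{nk}^x=\mathrm{E}(\mathbf{x}_{nt}\mathbf{x}_{n,t-k}').$(6)

The eigenvalues of the variance-covariance matrix $\Gamma_{n0}^x$ are real and non-negative. They are denoted by $\mu_{nj}^x$, $j=1,\ldots,n$, with the assumption that $\mu_{nj}^x\ge\mu_{n,j+1}^x$, $j=1,\ldots,n-1$. In the same way $\boldsymbol{\chi}_{nt}$, $\boldsymbol{\xi}_{nt}$, $\Gamma_{nk}^\chi$, $\Gamma_{nk}^\xi$, and the eigenvalues $\mu_{nj}^\chi$ and $\mu_{nj}^\xi$ are defined, again assuming that $\mu_{nj}^\chi\ge\mu_{n,j+1}^\chi$ and $\mu_{nj}^\xi\ge\mu_{n,j+1}^\xi$ for $j=1,\ldots,n-1$.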

#### Assumption 1

There exists an integer $s\ge0$ such that $\mu_{n,s+1}^x\le R$ for some positive real $R$ and all $n\in\mathbb{N}$.

If Assumption 1 holds, let $r$ be the minimum of all integers $s$ such that $\mu_{n,s+1}^x$ is bounded for $n\in\mathbb{N}$. Of course if $r>0$ then $\lim_{n\to\infty}\mu_{nj}^x=\infty$ for $1\le j\le r$.

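In Example 1, assuming that $\sigma_{\xi_i}^2=\sigma_\xi^2$ for all $i$ and writing $\sigma_F^2$ for the variance of $F_t$, it is easily seen that

$\Gamma_{n0}^x=\lambda^2\sigma_F^2\,\iota_n\iota_n'+\sigma_\xi^2I_n,\qquad\iota_n=(1\ 1\cdots1)',$

with eigenvalues

$\mu_{n1}^x=n\lambda^2\sigma_F^2+\sigma_\xi^2,\qquad\mu_{nj}^x=\sigma_\xi^2,\quad j=2,\ldots,n,$

so that Assumption 1 is fulfilled and $r=1$.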
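The eigenvalue pattern of Example 1 can be verified numerically. The sketch below builds the population variance-covariance matrix of the first $n$ variables under equal idiosyncratic variances (every off-diagonal entry equal to $\lambda^2$ times the factor variance, with $\sigma_\xi^2$ added on the diagonal) and computes its eigenvalues. The parameter values and the function name `example1_covariance` are illustrative assumptions, not taken from the text.

```python
import numpy as np

def example1_covariance(n, lam=1.5, sigma_F=1.0, sigma_xi=2.0):
    """Population covariance of (x_1t, ..., x_nt)' in Example 1 with equal idiosyncratic variances."""
    ones = np.ones((n, 1))
    return lam**2 * sigma_F**2 * (ones @ ones.T) + sigma_xi**2 * np.eye(n)

for n in (5, 50, 500):
    mu = np.linalg.eigvalsh(example1_covariance(n))[::-1]   # eigenvalues in descending order
    # The first eigenvalue grows linearly in n; the second stays at sigma_xi^2.
    print(n, mu[0], mu[1])

mu_50 = np.linalg.eigvalsh(example1_covariance(50))[::-1]
```

Only one eigenvalue diverges with $n$, while all the others stay bounded, which is the content of Assumption 1 with $r=1$.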

#### Observation 2

Assumption 1 is a fairly serious restriction. A simple example in which no finite s can be determined can be found in Lippi (2018) and the section “The Finite-Dimension Assumption and the Static Method.”

#### Theorem 1

Suppose that Assumption 1 holds and $r>0$. Then there exist an r-dimensional weakly stationary vector process $F_t=(F_{1t}\cdots F_{rt})'$ and loadings $\lambda_{ij}$, $i\in\mathbb{N}$, $j=1,\ldots,r$, such that

$x_{it}=\chi_{it}+\xi_{it},\qquad\chi_{it}=\lambda_{i1}F_{1t}+\lambda_{i2}F_{2t}+\cdots+\lambda_{ir}F_{rt},$(7)

where:

(i) $\lim_{n\to\infty}\mu_{nj}^\chi=\infty$, for $j\le r$.

(ii) There exists a positive real $W$ such that $\mu_{n1}^\xi\le W$ for all $n\in\mathbb{N}$.

(iii) $\xi_{it}$ and $F_t$ are orthogonal for all $t\in\mathbb{Z}$. As a consequence $\xi_{it}$ and the common components $\chi_{jt}$ are orthogonal for all $i$ and $j$.

Conversely, if the processes $x_{it}$ have representation (7) with an $\tilde{r}$-dimensional vector $\tilde{F}_t$, and (i), (ii), (iii) hold, then Assumption 1 is fulfilled and $r=\tilde{r}$.

This representation theorem was proved in Chamberlain and Rothschild (1983). It is a basic result in the literature on high-dimensional dynamic factor models taking the time-domain approach. Both the criteria to determine the number of factors and the construction of estimators for the common components and factors rely directly or indirectly on Theorem 1 (see “Estimation”).

Condition (ii) is the definition of an idiosyncratic component in this literature. Note that it is an asymptotic definition. It includes the case in which different idiosyncratic components are orthogonal (see assumption (b) earlier) with a bound for the idiosyncratic variances $\sigma_{\xi_i}^2$. However, it also includes nondiagonal variance-covariance matrices for the idiosyncratic components. For example, local or sectoral covariances for the idiosyncratic components produce a block-diagonal shape for $\Gamma_{n0}^\xi$. Imposing a bound for the maximum eigenvalues of the blocks, condition (ii) is fulfilled.

Theorem 1 can obviously be adapted to $r=0$, with (7) holding with $\chi_{it}=0$, so that the x’s themselves are idiosyncratic.

#### Observation 3

Condition (i) in Theorem 1 is crucial. It ensures that no representation with a smaller number of factors is possible. For example, suppose that $\chi_{it}=\lambda_{i1}F_{1t}+\lambda_{i2}F_{2t}$, and that $\lambda_{i2}=\alpha\lambda_{i1}$ for all $i\in\mathbb{N}$. Then, of course, setting $\tilde{F}_t=F_{1t}+\alpha F_{2t}$ produces $\chi_{it}=\lambda_{i1}(F_{1t}+\alpha F_{2t})=\lambda_{i1}\tilde{F}_t$, which is a representation with one factor. It is easy to see that in this case only $\mu_{n1}^\chi\to\infty$ and, as a consequence, only $\mu_{n1}^x\to\infty$; that is, $r$ is equal to one, not two.

#### Observation 4

It is important to point out that if the infinite-dimensional vector (2) has a factor structure (i.e., if Assumption 1 holds), then $r$, the number of factors, is identified as the number of eigenvalues of $\Gamma_{n0}^x$ tending to infinity. Moreover, the components $\chi_{it}$ and $\xi_{it}$ are identified as well; see Chamberlain (1983), Chamberlain and Rothschild (1983), and Forni and Lippi (2001). In particular, the space spanned by the common components $\chi_{it}$ and that spanned by the factors $F_t$ coincide. However, the factors $F_t$ and the corresponding loadings are identified only up to a linear transformation. Indeed, if $F_t$ is a vector of factors and $\chi_{it}=\lambda_iF_t$, where $\lambda_i=(\lambda_{i1}\cdots\lambda_{ir})$, then

$\chi_{it}=\lambda_iF_t=(\lambda_iH^{-1})(HF_t),$

where $H$ is any non-singular $r\times r$ matrix. As a consequence, if it is convenient, it can be assumed that the factors are orthonormal, that is, mutually orthogonal and unit-variance.

### Dynamics

The factor model (7) can accommodate several interesting dynamic structures. For example, suppose that all the x’s are driven by a common cyclical macroeconomic variable $C_t$ in the following way:

$x_{it}=\alpha_iC_t+\beta_iC_{t-1}+\xi_{it},$(8)

and assume that

$Display mathematics$

where $L$ is the lag operator and $u_t$ is a scalar white noise. Here each variable $x_{it}$ loads $C_t$ and $C_{t-1}$ with individual weights, so that, for example, some of the variables have $\beta_i=0$ and are therefore coincident with the cycle $C_t$ and others have $\alpha_i=0$ and are therefore lagging. Setting $F_{1t}=C_t$ and $F_{2t}=C_{t-1}$,

$x_{it}=\alpha_iF_{1t}+\beta_iF_{2t}+\xi_{it},$

which is a representation of the form (7).

More generally, it can be assumed that the common macroeconomic variable is a vector

$f_t=M(L)u_t,$

where $u_t$ is a q-dimensional white-noise vector and $M(L)$ is a square-summable $q\times q$ matrix (infinite) polynomial, and that

$x_{it}=\alpha_i'f_t+\beta_i'f_{t-1}+\xi_{it},$

where $\alpha_i$ and $\beta_i$ are $q\times1$. Here it is the macroeconomic vector $f_t$ that is loaded dynamically by the variables $x_{it}$. Setting $F_t=(f_t'\ f_{t-1}')'$, representation (7) is

$x_{it}=(\alpha_i'\ \beta_i')F_t+\xi_{it},$

with $r=2q$.

In both cases, a scalar or a vector cyclical macroeconomic variable, the vector $F_t$ is singular in that it has dimension $2q$ but is driven by a q-dimensional white-noise vector. In the first case:

$Display mathematics$

In the second:

A general representation of the factor model, in which the dynamic structure is explicit, is the following:

$x_{it}=\chi_{it}+\xi_{it},\qquad\chi_{it}=\lambda_iF_t,\qquad F_t=D(L)u_t,$(9)

where $F_t$ is r-dimensional, $u_t$ is q-dimensional white noise, and $D(L)$ is an $r\times q$ square-summable matrix polynomial. Note that representation (9) completes representation (3) with an equation linking $F_t$ to $u_t$. In view of (9) the terms static factors for $F_t$ and dynamic or primitive factors for $u_t$ are often used.

The argument employed in Observation 4 to show that $F_t$ is not identified applies to $u_t$ as well. Thus, with no loss of generality, it is assumed throughout that $u_t$ is an orthonormal white-noise vector.

As shown in these examples, the static factors $F_t$ and the static representation $x_{it}=\lambda_iF_t+\xi_{it}$ usually result from a transformation of the primitive dynamic form of the model. The static representation is a very convenient tool to construct estimators and forecasts for the variables $x_{it}$ (see the sections “Estimation” and “Forecasting”). However, when the model is employed for structural analysis the dynamic factors $u_t$ and the dynamic relationship between $u_t$ and the variables $x_{it}$ play a prominent role (see “The Factor Model and Structural Macroeconomic Analysis”).

The examples also show that typically $r>q$; that is, the vector $F_t$ is singular. This property has very important consequences, as shown in Example 2 and in Theorem 2.

#### Example 2

Consider again model (8), simplified by assuming $C_t=u_t$, so that $F_{1t}=u_t$, $F_{2t}=u_{t-1}$, and

$F_t=\begin{pmatrix}F_{1t}\\F_{2t}\end{pmatrix}=\begin{pmatrix}1\\L\end{pmatrix}u_t.$

This is a singular MA(1) representation. The unusual fact is that the MA matrix $(1\ L)'$ has a square, stable left inverse of degree 1, namely

$\begin{pmatrix}1&0\\-L&1\end{pmatrix},$

so that

$\begin{pmatrix}1&0\\-L&1\end{pmatrix}F_t=\begin{pmatrix}1\\0\end{pmatrix}u_t,$

which is an AR(1) representation.

This unexpected result, that a singular moving average of order one has a finite-degree autoregressive representation, is just an example of the following general result (see Anderson & Deistler, 2008).

#### Theorem 2

Assume that the r-dimensional vector $F_t$ has the representation

$F_t=D(L)u_t,$

where $u_t$ is a q-dimensional white noise and $r>q$, so that the vector $F_t$ is singular. Moreover, assume that the entries of $D(L)$ are rational functions of $L$: $d_{ij}(L)=h_{ij}(L)/k_{ij}(L)$, where $h_{ij}$ and $k_{ij}$, $i=1,\ldots,r$, $j=1,\ldots,q$, are polynomials. Then, for generic values of the coefficients of $h_{ij}$ and $k_{ij}$, that is, for all values of the coefficients with the exception of a lower-dimensional subset, $F_t$ has an autoregressive representation

$G(L)F_t=D(0)u_t,$

where (i) $G(L)$ is an $r\times r$, stable, finite-degree polynomial matrix, (ii) the rank of $D(0)$ is $q$. (For a more general definition of genericity, see Anderson & Deistler, 2008.)

Using Theorem 2, under the assumption that the entries of $D(L)$ are rational functions, representation (9) can generically be rewritten as

$x_{it}=\chi_{it}+\xi_{it},\qquad\chi_{it}=\lambda_iF_t,\qquad G(L)F_t=Ru_t,$(10)

where $R$ is $r\times q$ and $G(L)$ is $r\times r$, stable and finite-degree. Therefore, if an estimate of $F_t$ is available, the shocks $u_t$ can be estimated by means of a singular VAR for $F_t$.

The matrix $R$ has generically full rank $q$. As a consequence, generically the vector $u_t$ belongs to the space spanned by $F_{t-s}$, $s\ge0$; that is, by definition, $u_t$ is generically fundamental for $F_t$.

#### Example 3

The following elementary example, in which $r=2$ and $q=1$, can illustrate this result:

$F_{1t}=(1+2L)u_t,\qquad F_{2t}=(1+3L)u_t.$(11)

As the polynomials $1+2L$ and $1+3L$ are not invertible, $u_t$ is not fundamental for $F_{1t}$ or for $F_{2t}$. However, multiplying the first equation by 3, the second by 2, and subtracting: $u_t=3F_{1t}-2F_{2t}$, so that $u_t$ belongs to the space spanned by current and past values of the vector $F_t$, that is, $u_t$ is fundamental for the vector $F_t$.
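The identity $u_t=3F_{1t}-2F_{2t}$ of Example 3 can be confirmed on a simulated path. This is a small sketch assuming a Gaussian white noise $u_t$ and an arbitrary seed; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(501)          # u_0, ..., u_500 (Gaussian white noise, an assumption)

# Model (11): F_1t = (1 + 2L) u_t and F_2t = (1 + 3L) u_t, for t = 1, ..., 500.
F1 = u[1:] + 2.0 * u[:-1]
F2 = u[1:] + 3.0 * u[:-1]

# Contemporaneous linear combination of the two factors recovers the shock exactly.
u_recovered = 3.0 * F1 - 2.0 * F2
max_err = float(np.max(np.abs(u_recovered - u[1:])))
print(max_err)
```

The shock is recovered from the current value of the vector $F_t$ alone, although neither scalar moving average is invertible on its own.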

This section on the dynamics of the factor model concludes by observing that no special assumption has been imposed on the autocovariance of the idiosyncratic components $\xi_{it}$.

### Estimation

The section “General Assumptions and Representation Theorem” discussed estimation of the common components by means of the elementary model (4). In that case the arithmetic average of the observable variables $x_{it}$ converges in mean square to $\lambda F_t$ as $n\to\infty$. Slightly complicating the model by assuming that $x_{it}=\lambda_iF_t+\xi_{it}$, thus allowing the loadings to be variable specific, the average becomes

$\frac{1}{n}\sum_{i=1}^nx_{it}=\bar{\lambda}_nF_t+\frac{1}{n}\sum_{i=1}^n\xi_{it},$

where $\bar{\lambda}_n=(1/n)\sum_{i=1}^n\lambda_i$. The idiosyncratic average converges to zero in mean square, but an additional assumption is needed to ensure that $\bar{\lambda}_n$ is bounded away from zero as $n\to\infty$.

Further complications arise if $r=2$:

$x_{it}=\lambda_{i1}F_{1t}+\lambda_{i2}F_{2t}+\xi_{it}.$

To recover the two factors two distinct averages are needed. For example,

$Display mathematics$

but an assumption is still needed ensuring that the variance-covariance matrix of $\bar{x}_{nt}^1$ and $\bar{x}_{nt}^2$ remains bounded away from singularity (its determinant is bounded away from zero) as $n\to\infty$.

#### Principal Components

In general, in a model with $r$ factors, $r\ge1$, $r$ averages are needed:

$x_{nt}^j=\sum_{i=1}^na_{ij,n}x_{it},$

$j=1,\ldots,r$, such that:

(i) The variance-covariance matrix of the vector $y_{nt}=(x_{nt}^1\cdots x_{nt}^r)'$ is bounded away from singularity as $n\to\infty$.

(ii) $\lim_{n\to\infty}\sum_{i=1}^na_{ij,n}\xi_{it}=0$ in mean square.

A very convenient solution of this problem consists in setting $x_{nt}^j=(\mu_{nj}^x)^{-1/2}P_{nj,t}^x$, $j=1,\ldots,r$, where $P_{nj,t}^x$ is the j-th principal component of the vector $\mathbf{x}_{nt}$. Precisely,

$P_{nj,t}^x=p_{nj}^x\mathbf{x}_{nt}=\sum_{i=1}^np_{nj,i}^xx_{it},$

where $p_{nj}^x$ is a row eigenvector of $\Gamma_{n0}^x$ corresponding to the j-th eigenvalue, such that $\sum_{i=1}^n(p_{nj,i}^x)^2=1$.

#### Observation 5

In Lippi (2018) both the eigenvectors of the spectral density of the vector $\mathbf{x}_{nt}$ (i.e., dynamic eigenvectors) and the eigenvectors of the variance-covariance matrix of $\mathbf{x}_{nt}$ (i.e., static eigenvectors) are considered. In that article, to avoid confusion, the static eigenvectors are denoted as $p_{nj}^{S,x}$. As no confusion can arise here, $p_{nj}^x$ is used. □

Regarding (i), as is well known, the variance-covariance matrix of the principal-component vector $(P_{n1,t}^x\cdots P_{nr,t}^x)'$ is the $r\times r$ diagonal matrix whose entry $(k,k)$ is $\mu_{nk}^x$. As a consequence, the variance-covariance matrix of

$\left((\mu_{n1}^x)^{-1/2}P_{n1,t}^x\ \cdots\ (\mu_{nr}^x)^{-1/2}P_{nr,t}^x\right)'$

is the identity matrix $I_r$ and is therefore bounded away from singularity. Using $\mu_{nj}^x\to\infty$ as $n\to\infty$ for $j=1,\ldots,r$, and $\mu_{n1}^\xi\le W$ for all $n$, it is easy to show (ii), that is, that

$\lim_{n\to\infty}(\mu_{nj}^x)^{-1/2}\sum_{i=1}^np_{nj,i}^x\xi_{it}=0$ in mean square,

for $j=1,\ldots,r$.
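A sketch of why (ii) holds when the principal components are normalized to unit variance, using only that the eigenvector $p_{nj}^x$ has unit norm and that $\mu_{n1}^\xi$ is the largest eigenvalue of $\Gamma_{n0}^\xi$:

$\mathrm{E}\left[\left((\mu_{nj}^x)^{-1/2}\sum_{i=1}^np_{nj,i}^x\xi_{it}\right)^2\right]=\frac{p_{nj}^x\,\Gamma_{n0}^\xi\,(p_{nj}^x)'}{\mu_{nj}^x}\le\frac{\mu_{n1}^\xi}{\mu_{nj}^x}\le\frac{W}{\mu_{nj}^x}\to0\quad\text{as }n\to\infty,\qquad j=1,\ldots,r.$

The first inequality is the standard bound of a quadratic form in a unit vector by the largest eigenvalue; the limit uses the divergence of the first $r$ eigenvalues of $\Gamma_{n0}^x$.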

In conclusion, the first $r$ normalized principal components are an orthonormal basis “converging” in mean square to the space spanned by the factors $F_{jt}$, $j=1,\ldots,r$ (the average of the idiosyncratic components by means of each of the normalized eigenvectors tends to zero in mean square). This is the rationale for expressions like “the normalized principal components converge to the factors.” Though suggestive, such statements should be taken with extreme care. Indeed, as seen in Observation 4, any basis in the factor space provides a vector of static factors.

Principal components, as presented earlier, are unfeasible as estimators of the factors for two reasons. First, $\Gamma_{n0}^x$ is not observed; only an estimate, call it $\hat{\Gamma}_{n0}^x$, is available, which depends on the observed $(n,T)$ sample of the processes $x_{it}$. Second, the number $r$ of factors is not known.

Determining the number of factors is preliminary to estimation of the principal components. The literature on the determination of $r$ starts with Bai and Ng (2002), who use an information criterion that trades off the variance explained by the factors against a penalty for adding factors. They show consistency of the criterion as $n$ and $T$ tend to infinity. Improvements on and alternatives to Bai and Ng’s criterion are proposed in Onatski (2010), Alessi, Barigozzi, and Capasso (2010), and Ahn and Horenstein (2013). A test for the number of factors is proposed in Onatski (2009).

Regarding the number of dynamic factors $q$, consistent criteria are given in Hallin and Liška (2007), Amengual and Watson (2007), and Bai and Ng (2007).

Once $r$ has been determined, using the estimated principal components of the x’s (by means of the eigenvectors of $\hat{\Gamma}_{n0}^x$), estimates of the factors and the loadings are obtained. Standard consistency, that is, convergence in probability, as $n$ and $T$ tend to infinity, is proven in Stock and Watson (2002a, 2002b) and Bai and Ng (2002). In particular, Bai and Ng obtain the rate of convergence $\max(T^{-1},n^{-1})$ for the squared distance between the estimated factors and the factor space.
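The principal-component step can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the procedure of any of the cited papers: data are simulated from a static model with $r=2$ Gaussian factors, Gaussian loadings, and unit-variance orthogonal idiosyncratic components, $r$ is treated as known, and the helper names (`simulate_static_factor_model`, `pc_factors`, `r2_on_factor_space`) are hypothetical. Since the factors are identified only up to a linear transformation, accuracy is measured by regressing each true factor on the estimated ones.

```python
import numpy as np

def simulate_static_factor_model(n, T, r, seed=0):
    """Simulate x_it = lambda_i F_t + xi_it with Gaussian factors, loadings, and noise."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((T, r))
    Lam = rng.standard_normal((n, r))
    xi = rng.standard_normal((T, n))
    return F @ Lam.T + xi, F, Lam

def pc_factors(x, r):
    """First r normalized principal components of the T x n data matrix x."""
    x = x - x.mean(axis=0)
    T = x.shape[0]
    gamma_hat = x.T @ x / T                  # n x n sample variance-covariance matrix
    mu, p = np.linalg.eigh(gamma_hat)        # eigenvalues in ascending order
    order = np.argsort(mu)[::-1][:r]         # indices of the r largest eigenvalues
    mu_r, p_r = mu[order], p[:, order]
    pcs = x @ p_r                            # principal components, T x r
    return pcs / np.sqrt(mu_r), mu_r, p_r    # normalized to unit sample variance

def r2_on_factor_space(F, F_hat):
    """R^2 of each true factor regressed on the estimated factors (rotation-invariant)."""
    coef, *_ = np.linalg.lstsq(F_hat, F, rcond=None)
    resid = F - F_hat @ coef
    return 1.0 - resid.var(axis=0) / F.var(axis=0)

x, F, Lam = simulate_static_factor_model(n=200, T=300, r=2)
F_hat, mu_r, _ = pc_factors(x, r=2)
r2 = r2_on_factor_space(F, F_hat)
print(r2)
```

The regression-based measure is used instead of a direct comparison of `F_hat` with `F` because any non-singular transformation of the factors is an equally valid vector of static factors.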

Once the factors have been estimated and $q$ has been determined, estimation of the last equation in (10) provides an estimate of the dynamic shocks $u_t$ and $G(L)$, thus completing the estimation of the high-dimensional dynamic factor model.

Doz, Giannone, and Reichlin (2011) construct a consistent Gaussian quasi-maximum likelihood estimator as an alternative to principal components. Given $n$, they estimate a factor model by maximum likelihood under the assumption that the idiosyncratic components are mutually orthogonal. They show that the bias resulting from “forcing” this orthogonality into the finite-n model becomes negligible as $n$ tends to infinity.

### Forecasting

Suppose that one wants to forecast the variable $x_{1t}$. If the factors and the loadings were known, then separately forecasting the factors $F_t$ and the idiosyncratic component $\xi_{1t}$, using past values of $F_t$ and $\xi_{1t}$ respectively, would produce a forecast of $x_{1t}$ that outperforms forecasting $x_{1t}$ by its own past values. Equivalently, the orthogonal projection of $x_{1,t+h}$, $h>0$, on $F_{t-s}$, $\xi_{1,t-s}$, $s\ge0$, outperforms the projection on $x_{1,t-s}$, or, lastly, the projection

$\mathrm{proj}\left(x_{1,t+h}\mid F_{t-s},\ x_{1,t-s},\ s\ge0\right)$(12)

outperforms the projection on $x_{1,t-s}$, $s\ge0$. Again, projection (12) is unfeasible and is replaced in empirical situations by

$\hat{x}_{1,t+h}=\hat{\alpha}(L)\hat{F}_t+\hat{\beta}(L)x_{1t},$

where $\hat{\alpha}(L)$ is a finite-degree $1\times r$ matrix polynomial and $\hat{\beta}(L)$ is a finite-degree polynomial. For this specification, see Stock and Watson (2002a).
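A feasible forecasting regression of this kind can be sketched as follows. This is a deliberately simplified illustration, not the specification of Stock and Watson (2002a): the data are simulated from a model with one common shock loaded with one lag (so $r=2$), the horizon is $h=1$, both lag polynomials have degree zero, and the comparison is in-sample only. The helper name `ols_mse` is hypothetical.

```python
import numpy as np

def ols_mse(y, X):
    """In-sample mean squared error of the least-squares fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((y - X @ coef) ** 2))

rng = np.random.default_rng(0)
n, T = 100, 400
u = rng.standard_normal(T + 1)
F = np.column_stack([u[1:], u[:-1]])              # F_1t = u_t, F_2t = u_{t-1}
Lam = rng.standard_normal((n, 2))                 # loadings (alpha_i, beta_i), an assumption
x = F @ Lam.T + rng.standard_normal((T, n))       # observable panel, T x n

# Estimated factors: first two principal components of the demeaned panel.
xc = x - x.mean(axis=0)
mu, p = np.linalg.eigh(xc.T @ xc / T)             # eigenvalues in ascending order
F_hat = xc @ p[:, -2:]                            # components for the two largest eigenvalues

y = xc[1:, 0]                                     # x_{1,t+1}
own_lag = xc[:-1, [0]]                            # x_{1t}
mse_ar = ols_mse(y, own_lag)                      # projection on the variable's own past
mse_factor = ols_mse(y, np.hstack([F_hat[:-1], own_lag]))   # factors added as regressors
print(mse_ar, mse_factor)
```

Because the second regression nests the first, its in-sample error cannot be larger; a proper forecast comparison would use out-of-sample evaluation and data-driven lag lengths.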

The literature on forecasting with factor models is vast and cannot be reviewed here. Many papers have produced forecast comparisons including univariate models and different specifications of the dynamic factor model. Different datasets have been used for the United States and Europe in particular. A recent paper reviewing this literature is Stock and Watson (2016).

Regarding the method described in this article versus its “competitor,” presented in Lippi (2018), see Forni, Hallin, Lippi, and Zaffaroni (2017), containing results based on simulated data, and Forni, Giovannelli, Lippi, and Soccorsi (2018), based on the U.S. monthly macroeconomic data set known as the Stock and Watson data set. Both comparisons are illustrated in detail in Lippi (2018); see the section “The Static and Dynamic Methods Compared: Results Based on Simulated and Empirical Data.”

### The Factor Model and Structural Macroeconomic Analysis

The high-dimensional dynamic factor models have been mainly developed with the purpose of analyzing and forecasting macroeconomic data sets. In this context, the idiosyncratic components are interpreted as causes of variation of the x’s that are specific to one or just a few variables, like regional or sectoral shocks, plus measurement errors. In particular, for the big aggregates like income, consumption, and investment, in which all local or sectoral shocks have been averaged out, the variable $\xi_{it}$ can be interpreted as only containing measurement error. On the other hand, the common factors $F_t$ and $u_t$, being pervasive, that is, affecting all the variables $x_{it}$, can be interpreted as macroeconomic causes of variation.

This interpretation of common and idiosyncratic components of macroeconomic variables has been used to argue that dynamic factor models offer an alternative with important advantages to structural VAR analysis. For heuristic purposes a very simple model is considered here, assuming that the variables $\chi_{1t}$ and $\chi_{2t}$ are the rate of growth of the GDP and aggregate consumption, respectively, and are driven by a demand shock $v_{1t}$ and a supply shock $v_{2t}$:

$\begin{pmatrix}\chi_{1t}\\\chi_{2t}\end{pmatrix}=B_2(L)\begin{pmatrix}v_{1t}\\v_{2t}\end{pmatrix},$(13)

where $B_2(L)$ is a $2\times2$ matrix of rational functions of $L$. Moreover, it is assumed that the supply shock has no contemporaneous effect on consumption, so that $b_{2,22}(0)=0$, that is, the (2, 2) entry of $B_2(L)$ vanishes for $L=0$.

The variables $\chi_{it}$, $i=1,2$, are observed with a measurement error: $x_{it}=\chi_{it}+\xi_{it}$, $i=1,2$. Moreover, it is supposed that $x_{1t}$ and $x_{2t}$ are the first two variables of a large data set with $n$ variables, generated by a dynamic factor model with $q=2$, $r>q$, and that all the common components are driven by $v_t=(v_{1t}\ v_{2t})'$. This means $\mathbf{x}_{nt}=\boldsymbol{\chi}_{nt}+\boldsymbol{\xi}_{nt}$,

$\boldsymbol{\chi}_{nt}=B_n(L)v_t,$(14)

where $B_n(L)$ is an $n\times2$ matrix of rational functions in $L$ whose upper $2\times2$ submatrix is $B_2(L)$. The variables $x_{it}$ can also be represented as in (10). Using the second and third equation,

$\boldsymbol{\chi}_{nt}=\Lambda G(L)^{-1}Ru_t=C_n(L)u_t,$(15)

where $\Lambda=(\lambda_1'\cdots\lambda_n')'$.

The shocks $v_{1t}$ and $v_{2t}$ in representation (14) have an economic interpretation as demand and supply shocks, while the entries of $B_n(L)$ are the corresponding impulse-response functions. On the other hand, representation (15) is “estimated” starting with the determination of $r$ and $q$. However, $B_n(L)$ and $C_n(L)$ are singular, so that by Theorem 2, one can safely assume that both are fundamental. This implies that there exists a $2\times2$ orthogonal matrix $K$ such that $u_t=Kv_t$ (see, e.g., Forni, Giannone, Lippi, & Reichlin, 2009, Proposition 2), so that

$\boldsymbol{\chi}_{nt}=C_n(L)u_t=C_n(L)Kv_t.$

Because $v_t$ is orthonormal white noise, $B_n(L)=C_n(L)K$, so that the (2, 2) entry of $C_n(0)K$ vanishes. Equivalently, determining $K$ such that the (2, 2) entry of $C_n(0)K$ vanishes, the structural impulse-response functions $B_n(L)$, $B_2(L)$ in particular, are obtained.
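The identification step can be illustrated with a small computation. In the sketch below, `C0` stands for the upper $2\times2$ block of an estimated impact matrix (the impulse responses at $L=0$); its numerical values are made up for illustration, and the function name `rotation_zeroing_22` is hypothetical. Only rotations are searched, so reflections and sign normalizations of the shocks, which a full treatment would have to fix, are ignored.

```python
import numpy as np

def rotation_zeroing_22(C0):
    """Return a 2x2 rotation K such that the (2, 2) entry of C0 @ K is zero."""
    # (C0 @ K)[1, 1] = -C0[1, 0] * sin(theta) + C0[1, 1] * cos(theta)
    theta = np.arctan2(C0[1, 1], C0[1, 0])
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

C0 = np.array([[1.0, 0.5],
               [0.8, 0.6]])        # illustrative impact responses, not estimates from data
K = rotation_zeroing_22(C0)
B0 = C0 @ K                        # impact responses satisfying the zero restriction
print(B0)
```

Applying the same $K$ to the whole impulse-response matrix, not only to its value at $L=0$, would give the candidate structural responses.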

In conclusion, estimating the common components of the x’s and imposing the identifying restriction on the (2, 2) entry results in an estimate of the structural model (13).

On the other hand, SVAR analysis starts with the estimation of an autoregression

$A_2(L)\begin{pmatrix}x_{1t}\\x_{2t}\end{pmatrix}=\epsilon_t.$

Then the residual vector $\epsilon_t=(\epsilon_{1t}\ \epsilon_{2t})'$ is transformed into a vector $w_t$ such that $\epsilon_t=Uw_t$, where $U$ is a non-singular matrix, such that

(i) $w_t$ is orthonormal,

(ii) the impulse-response functions $A_2(L)^{-1}U=E_2(L)$, such that

$\begin{pmatrix}x_{1t}\\x_{2t}\end{pmatrix}=E_2(L)w_t,$(16)

fulfill restrictions suggested by economic logic. In this case, assuming knowledge of the underlying model (13), the restriction $e_{22}(0)=0$ will be imposed. As is well known, this restriction just-identifies the matrix $U$ and therefore the matrix $E_2(L)$.

Next is a comparison of the results obtained with the SVAR, that is (16), with those obtained with the dynamic factor model, that is (13):

1. First, both the estimated matrix $E_2(L)$ and the shocks $w_t$ are “contaminated” by the measurement errors $\xi_{it}$, $i=1,2$.

2. Assuming that the errors $\xi_{it}$, $i=1,2$, are small, one might be led to believe that $E_2(L)$ and $w_t$ are a good and easily estimated approximation of $B_2(L)$ and $v_t$, respectively. To see that this is not necessarily the case, observe that the matrix $E_2(L)$ is by definition invertible, so that representation (16) is fundamental. On the other hand, Theorem 2 ensures that representation (14) is fundamental but says nothing about representation (13).

3. This is easily understood using model (11), where the scalar shock $u_t$ is fundamental for the two-dimensional vector $F_t$ but nonfundamental for $F_{1t}$ or $F_{2t}$. Even assuming that $F_{1t}$ is observed without error, if one estimates an AR for $F_{1t}$, $a(L)F_{1t}=\tilde{u}_t$, the resulting MA representation $F_{1t}=a(L)^{-1}\tilde{u}_t$ is not an approximation of the “structural” representation $F_{1t}=(1+2L)u_t$.

In general, the difficulty with SVAR analysis is that the “structural” moving average $x_t=E(L)w_t$ is obtained by inverting the estimated autoregressive matrix $A(L)$ and multiplying $A(L)^{-1}$ by a non-singular matrix: $E(L)=A(L)^{-1}U$. Thus $E(L)$ is obviously invertible, $E(L)^{-1}=U^{-1}A(L)$; that is, $w_t$ is by definition fundamental for $x_t$.
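A small simulation may make the point concrete for the scalar case $(1+2L)u_t$ mentioned above. This is a sketch under stated assumptions: Gaussian shocks with unit variance, a sample of 5,000 observations, and an AR(20) fitted by least squares are all illustrative choices, not part of the original argument.

```python
import numpy as np

rng = np.random.default_rng(0)
T, p = 5000, 20                       # sample size and AR order (illustrative choices)
u = rng.standard_normal(T + 1)        # structural shock, unit variance (assumption)
x = u[1:] + 2.0 * u[:-1]              # x_t = (1 + 2L) u_t, a nonfundamental MA(1)

# Least-squares AR(p): regress x_t on x_{t-1}, ..., x_{t-p}
Y = x[p:]
X = np.column_stack([x[p - j:T - j] for j in range(1, p + 1)])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ coef                  # estimated AR innovation

shock = u[1:][p:]                     # structural shock aligned with Y
corr = np.corrcoef(resid, shock)[0, 1]
var_ratio = resid.var() / shock.var()
print(corr, var_ratio)
```

In population, the fundamental representation of this process is $(1+0.5L)\tilde{u}_t$ with an innovation variance four times that of $u_t$, and the correlation between $\tilde{u}_t$ and $u_t$ is 0.5. The AR residual therefore recovers the fundamental innovation, not the structural shock, however long the sample.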

However, fundamentalness, which has a motivation in Theorem 2 for singular representations like (13), has no motivation in SVAR analysis, which is based on non-singular vectors. This point was made several years ago (see Hansen & Sargent, 1980; Lippi & Reichlin, 1993, 1994) and is now a topic for a growing literature (see, e.g., Alessi, Barigozzi, & Capasso, 2011; Fernández-Villaverde, Rubio-Ramírez, Sargent, & Watson, 2007). The following simple model may help the reader to grasp the point of fundamentalness in SVAR analysis.

#### Example 4

Suppose that $x_t$ is the rate of change of aggregate productivity and that $x_t$ evolves according to the following equation:

$$x_t=p_1u_t+p_2u_{t-1},$$ (17)

where $u_t$ is the shock to technology, $p_i>0$, $i=1,2$, and $p_1+p_2=1$. Thus $u_t$ determines the increase of productivity $p_1u_t$ at time $t$ and the increase $p_2u_t$ at time $t+1$. Equation (17) has the interpretation of a learning-by-doing process taking two time periods to complete. Rewriting (17) as

$$x_t=V_t+PV_{t-1},$$

where $V_t=p_1u_t$ and $P=p_2/p_1$, one has a standard MA(1). If $P<1$, the MA(1) is invertible and estimating an autoregression for $x_t$ would produce an approximation of the structural shock $V_t$. On the other hand, if $P>1$, an autoregression would not produce an approximation of $V_t$. In the first case $V_t$ is fundamental for $x_t$, and in the second case $V_t$ is nonfundamental. However, the econometrician observes only the autocovariances of $x_t$, so that he or she only knows that $x_t$ is an MA(1). Therefore, either additional information is available on $p_1$ and $p_2$, that is on $P$, or the model is not identified. Moreover, the usual practice of estimating autoregressions produces an approximation of the structural shock $V_t$ only if the learning-by-doing process has $p_1>p_2$. □
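The identification problem in Example 4 can be checked directly on the autocovariances. The sketch below assumes a unit-variance technology shock and hypothetical weights $p_1=0.3$, $p_2=0.7$ (so that $P>1$); the helper `ma1_autocov` is an illustrative name. It compares the structural MA(1) with the invertible MA(1) that has coefficient $1/P$ and innovation variance scaled by $P^2$.

```python
import numpy as np

def ma1_autocov(theta, sigma2):
    """Lag-0 and lag-1 autocovariances of x_t = e_t + theta * e_{t-1}, var(e_t) = sigma2."""
    return np.array([(1.0 + theta**2) * sigma2, theta * sigma2])

p1, p2 = 0.3, 0.7                    # hypothetical weights with p1 < p2, p1 + p2 = 1
P = p2 / p1                          # MA coefficient of the structural representation
var_V = p1**2                        # var(V_t) = p1^2 * var(u_t), with var(u_t) = 1 assumed

structural = ma1_autocov(P, var_V)                 # nonfundamental, since P > 1
fundamental = ma1_autocov(1.0 / P, P**2 * var_V)   # invertible counterpart
print(structural, fundamental)
```

The two representations imply the same autocovariances, so data on $x_t$ alone cannot distinguish them; an estimated autoregression delivers the invertible one.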

In conclusion, under the assumption that the idiosyncratic components of the large macroeconomic variables are measurement errors, the dynamic factor model produces estimates of the “true” underlying variables $\chi_{it}$, of the structural shocks, and of the impulse-response functions. In particular, singularity of the vector $\chi_t$ and Theorem 2 imply that the shocks estimated from representation (10) differ from the structural shocks by a non-singular matrix.

For the comparison of SVAR analysis and structural analysis based on high-dimensional dynamic factor models, see Stock and Watson (2005) and Forni et al. (2009).

### Non-Stationary Processes

As observed in the section “General Assumptions and Representation Theorem,” almost all the observed variables $x_{it}$ in macroeconomic datasets are non-stationary. Assume for simplicity that all the variables $x_{it}$ are $I(1)$, that the factors are $I(1)$, and that the idiosyncratic components are $I(0)$. In this case the factors $F_t$ can be directly estimated as the principal components of the $I(1)$ variables $x_{it}$ (Bai, 2004; Peña & Poncela, 2006). If the idiosyncratic components are $I(1)$ or $I(0)$, the principal components of the stationary series $(1-L)x_{it}$ provide an estimate of the differenced factors $(1-L)F_t$ and of the loadings. The factors $F_t$ can then be recovered by integration (see, e.g., Bai & Ng, 2004). In any case, the common practice of taking principal components of the variables $(1-L)x_{it}$ provides consistent estimates of the factors and the loadings.
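The difference-then-integrate procedure can be sketched as follows. This is an illustration under simplifying assumptions that are not in the text: a single random-walk factor, i.i.d. Gaussian $I(0)$ idiosyncratic components, loadings drawn uniformly from $[0.5, 1.5]$, and a normalization in which the estimated loadings have squared norm $n$. Principal components are computed through a singular value decomposition of the differenced panel.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 100, 500                                      # cross-section and time dimensions
F = np.cumsum(rng.standard_normal(T))                # I(1) factor: a random walk
lam = rng.uniform(0.5, 1.5, size=n)                  # hypothetical loadings
x = np.outer(F, lam) + rng.standard_normal((T, n))   # I(1) panel with I(0) idiosyncratic part

dx = np.diff(x, axis=0)                              # (1 - L) x_it, shape (T - 1, n)
# No centering here: the simulated differences have zero mean by construction
U, s, Vt = np.linalg.svd(dx, full_matrices=False)
dF_hat = U[:, 0] * s[0] / np.sqrt(n)                 # estimated differenced factor
lam_hat = Vt[0] * np.sqrt(n)                         # estimated loadings, squared norm n
F_hat = np.cumsum(dF_hat)                            # recover the level by integration

# The factor is identified only up to sign and scale, so compare by correlation
corr = np.corrcoef(F_hat, F[1:] - F[0])[0, 1]
print(abs(corr))
```

The integrated estimate is compared with $F_t-F_0$ because the initial level is lost when differencing.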

Thus, as far as estimating the factors and the loadings is concerned, the practice of estimating the dynamic factor model using the differences of the variables $x_{it}$ is correct. A difficulty with respect to the stationary case arises with the estimation of $u_t$ and of the impulse-response functions of the variables $x_{it}$ with respect to $u_t$ (or structural shocks obtained by a linear transformation of $u_t$). This requires the estimation of an autoregressive model for the $I(1)$ factors $F_t$. The latter, under the assumption of singularity, are obviously cointegrated with cointegration rank at least $r-q$. To see this, consider representation (9). As $F_t$ is $I(1)$, the moving average for $F_t$ becomes

$$(1-L)F_t=D(L)u_t.$$

The $r$-dimensional vector $F_t$ is cointegrated if the equation

$$cD(1)=0,$$

where $c$ is an $r$-dimensional row vector, has nontrivial solutions. But $D(1)$ is $r\times q$, so that its rank is less than or equal to $q$ and the solutions $c$ form a space of dimension at least $r-q$; that is, the cointegration rank of $F_t$ is at least $r-q$.
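The dimension count can be verified numerically by computing the left null space of $D(1)$. The matrix below is a hypothetical $D(1)$ with $r=3$ and $q=1$, chosen only for illustration, and `cointegration_vectors` is an illustrative helper based on the singular value decomposition.

```python
import numpy as np

def cointegration_vectors(D1, tol=1e-10):
    """Rows c spanning the solutions of c @ D1 = 0 (the left null space of D1)."""
    _, s, Vt = np.linalg.svd(D1.T)       # D1.T is q x r, so Vt is r x r
    rank = int(np.sum(s > tol))          # numerical rank of D1, at most q
    return Vt[rank:]                     # (r - rank) orthonormal row vectors

# Hypothetical long-run matrix D(1) with r = 3 static factors and q = 1 shock
D1 = np.array([[1.0],
               [2.0],
               [0.5]])
C = cointegration_vectors(D1)
print(C.shape)
```

Here the rank of $D(1)$ equals $q=1$, so the sketch returns $r-q=2$ linearly independent vectors; a rank-deficient $D(1)$ would yield more.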

Cointegration of $F_t$ implies that a VAR in the differences $(1-L)F_t$ is misspecified. Rather, the VAR should be estimated either in the levels or specified as an error correction mechanism. On cointegration of singular vectors and related error correction mechanisms, see Barigozzi, Lippi, and Luciani (2016a) and Banerjee, Marcellino, and Masten (2014, 2017). On estimation of the high-dimensional dynamic factor models with $I(1)$ factors and $I(1)$ idiosyncratic components, see Barigozzi, Lippi, and Luciani (2016b).

### Conclusions and Open Problems for Future Research

Forecasting with high-dimensional dynamic factor models has been developed in a vast literature, with several variants and refinements of the standard method based on principal components. Recent papers comparing different proposals seem to show that great improvements are unlikely within the present approach, that is, by and large, principal-components methods applied to data sets containing between 100 and 300 series; see the papers cited in “Estimation.” Rather, it is possible that data sets of the big-data size may help to produce indicators leading to substantial improvements in the predictive power of dynamic factor models. For an exploration of possible use of big data in macroeconomic analysis, see Ng (2017).

Regarding the application of dynamic factor models to macroeconomic analysis, important papers are Bernanke and Boivin (2003), Bernanke, Boivin, and Eliasz (2005), and Boivin, Giannoni, and Mihov (2009). In Forni and Gambetti (2010) a data set containing 112 U.S. monthly macroeconomic series is used to study the effects of monetary policy on the common components of some key variables, that is, on the variables obtained by removing measurement errors; see “The Factor Model and Structural Macroeconomic Analysis.” Monetary policy shocks are identified using a standard recursive scheme, in which the first impact on both industrial production and prices is zero. They find that some puzzling results obtained using SVAR methods disappear when the observed variables are replaced by their common components. First, the maximal effect on the common component of bilateral real exchange rates is observed on impact, so that the “delayed overshooting” puzzle disappears. Second, after a contractionary shock, prices fall at all horizons, so that the price puzzle does not occur. Finally, monetary policy has a sizable effect on both real and nominal variables.

Systematic application of factor-model techniques to macroeconomic analysis, as described in the sections “The Factor Model and Structural Macroeconomic Analysis” and “Non-Stationary Processes,” which has been insufficiently explored so far, is an extremely promising research field.

Some papers cited in this article and Lippi (2018) are selected here as basic contributions on dynamic factor models.

For the idea of latent dynamic factors in macroeconomic time series, see Sargent and Sims (1977). Chamberlain and Rothschild (1983) and Chamberlain (1983) introduce the definitions of common and idiosyncratic components for a large cross-section of stochastic variables. They also establish the relationship between principal components and factor analysis.

Forni, Hallin, Lippi, and Reichlin (2000), Forni and Lippi (2001), Stock and Watson (2002a, 2002b), and Bai and Ng (2002) extend Chamberlain and Rothschild’s (1983) contributions to a time-series context, formalizing the large-dimensional dynamic factor model.

The determination of the number of static and dynamic factors was first studied in Bai and Ng (2002) and Hallin and Liška (2007), respectively.

Regarding forecasting with dynamic factor models, see Stock and Watson (2016), Forni et al. (2017), Forni et al. (2018), and Lippi (2018).

Stock and Watson (2005) and Forni et al. (2009) argue that dynamic factor models can be used for structural analysis as an alternative to SVAR models.

Lastly, regarding the fundamentalness problem in empirical macroeconomics, see Alessi et al. (2011). For fundamentalness and singular vectors, see Anderson and Deistler (2008).

#### References

• Ahn, S. C., & Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3), 1203–1227.
• Alessi, L., Barigozzi, M., & Capasso, M. (2010). Improved penalization for determining the number of factors in approximate static factor models. Statistics and Probability Letters, 80, 1806–1813.
• Alessi, L., Barigozzi, M., & Capasso, M. (2011). Nonfundamentalness in structural econometric models: A review. International Statistical Review, 79, 16–47.
• Amengual, D., & Watson, M. W. (2007). Consistent estimation of the number of dynamic factors in a large N and T panel. Journal of Business & Economic Statistics, 25(1), 91–96.
• Anderson, B. D. O., & Deistler, M. (2008). Properties of zero-free transfer function matrices. SICE Journal of Control, Measurement and System Integration, 1(4), 284–292.
• Bai, J. (2004). Estimating cross-section common stochastic trends in nonstationary panel data. Journal of Econometrics, 122, 137–183.
• Bai, J., & Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1), 191–221.
• Bai, J., & Ng, S. (2004). A PANIC attack on unit roots and cointegration. Econometrica, 72, 1127–1177.
• Bai, J., & Ng, S. (2007). Determining the number of primitive shocks in factor models. Journal of Business & Economic Statistics, 25, 52–60.
• Banerjee, A., Marcellino, M., & Masten, I. (2014). Forecasting with factor augmented error correction models. International Journal of Forecasting, 30, 589–612.
• Banerjee, A., Marcellino, M., & Masten, I. (2017). Structural FECM: Cointegration in large-scale structural FAVAR models. Journal of Applied Econometrics, 32, 1069–1086.
• Barigozzi, M., Lippi, M., & Luciani, M. (2016a). Dynamic factor models, cointegration, and error correction mechanisms. arXiv.
• Barigozzi, M., Lippi, M., & Luciani, M. (2016b). Non-stationary dynamic factor models for large datasets. Finance and Economics Discussion Series 2016–024. Washington, DC: Board of Governors of the Federal Reserve System.
• Bernanke, B. S., & Boivin, J. (2003). Monetary policy in a data-rich environment. Journal of Monetary Economics, 50(3), 525–546.
• Bernanke, B. S., Boivin, J., & Eliasz, P. S. (2005). Measuring the effects of monetary policy: A factor-augmented vector autoregressive (FAVAR) approach. The Quarterly Journal of Economics, 120, 387–422.
• Boivin, J., Giannoni, M. P., & Mihov, I. (2009). Sticky prices and monetary policy: Evidence from disaggregated US data. American Economic Review, 99(1), 350–384.
• Burns, A. F., & Mitchell, W. C. (1946). Measuring business cycles. Cambridge, MA: National Bureau of Economic Research.
• Chamberlain, G. (1983). Funds, factors and diversification in arbitrage pricing models. Econometrica, 51(5), 1281–1304.
• Chamberlain, G., & Rothschild, M. (1983). Arbitrage, factor structure, and mean-variance analysis on large asset markets. Econometrica, 51(5), 1305–1324.
• Doz, C., Giannone, D., & Reichlin, L. (2011). A quasi–maximum likelihood approach for large, approximate dynamic factor models. Review of Economics and Statistics, 94(4), 1014–1024.
• Fernández-Villaverde, J., Rubio-Ramírez, J. F., Sargent, T. J., & Watson, M. W. (2007). ABCs (and Ds) of understanding VARs. American Economic Review, 97(3), 1021–1026.
• Forni, M., & Gambetti, L. (2010). The dynamic effects of monetary policy: A structural factor model approach. Journal of Monetary Economics, 57, 203–216.
• Forni, M., Giannone, D., Lippi, M., & Reichlin, L. (2009). Opening the black box: Structural factor models with large cross sections. Econometric Theory, 25(5), 1319–1347.
• Forni, M., Giovannelli, A., Lippi, M., & Soccorsi, S. (2018). Dynamic factor model with infinite-dimensional factor space: Forecasting. Journal of Applied Econometrics, 33(5), 625–642.
• Forni, M., Hallin, M., Lippi, M., & Reichlin, L. (2000). The generalized dynamic-factor model: Identification and estimation. The Review of Economics and Statistics, 82(4), 540–554.
• Forni, M., Hallin, M., Lippi, M., & Zaffaroni, P. (2017). Dynamic factor models with infinite-dimensional factor spaces: Asymptotic analysis. Journal of Econometrics, 199(1), 72–92.
• Forni, M., & Lippi, M. (1997). Aggregation and the microfoundations of dynamic macroeconomics. New York, NY: Oxford University Press.
• Forni, M., & Lippi, M. (2001). The generalized dynamic factor model: Representation theory. Econometric Theory, 17(6), 1113–1141.
• Forni, M., & Reichlin, L. (1996). Dynamic common factors in large cross-sections. Empirical Economics, 21(1), 27–42.
• Forni, M., & Reichlin, L. (1998). Let’s get real: A factor analytical approach to disaggregated business cycle dynamics. Review of Economic Studies, 65(3), 453–473.
• Hallin, M., & Lippi, M. (2013). Factor models in high-dimensional time series: A time-domain approach. Stochastic Processes and Their Applications, 123(7), 2678–2695.
• Hallin, M., & Liška, R. (2007). Determining the number of factors in the general dynamic factor model. Journal of the American Statistical Association, 102(478), 603–617.
• Hansen, L. P., & Sargent, T. J. (1980). Formulating and estimating dynamic linear rational expectations models. Journal of Economic Dynamics and Control, 2(1), 7–46.
• Lippi, M. (2018). Frequency-domain approach in high-dimensional dynamic factor models. In Oxford research encyclopedia of economics and finance. New York, NY: Oxford University Press.
• Lippi, M., & Reichlin, L. (1993). The dynamic effects of aggregate demand and supply disturbances: Comment. American Economic Review, 83(3), 644–652.
• Lippi, M., & Reichlin, L. (1994). VAR analysis, nonfundamental representations, Blaschke matrices. Journal of Econometrics, 63(1), 307–325.
• Ng, S. (2017). Opportunities and challenges: Lessons from analyzing terabytes of scanner data. NBER Working Paper 23673. Cambridge, MA: National Bureau of Economic Research.
• Onatski, A. (2009). Testing hypotheses about the number of factors in large factor models. Econometrica, 77(5), 1447–1479.
• Onatski, A. (2010). Determining the number of factors from empirical distribution of eigenvalues. The Review of Economics and Statistics, 92(4), 1004–1016.
• Peña, D., & Poncela, P. (2006). Nonstationary dynamic factor analysis. Journal of Statistical Planning and Inference, 136, 1237–1257.
• Sargent, T. J., & Sims, C. A. (1977). Business cycle modeling without pretending to have too much a priori economic theory. In C. A. Sims (Ed.), New methods in business cycle research. Minneapolis, MN: Federal Reserve Bank of Minneapolis.
• Stock, J. H., & Watson, M. W. (1998). Diffusion indexes. Working Paper. Cambridge, MA: National Bureau of Economic Research.
• Stock, J. H., & Watson, M. W. (2002a). Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association, 97(460), 1167–1179.
• Stock, J. H., & Watson, M. W. (2002b). Macroeconomic forecasting using diffusion indexes. Journal of Business & Economic Statistics, 20(2), 147–162.
• Stock, J. H., & Watson, M. W. (2005). Implications of dynamic factor models for VAR analysis. NBER Working Paper 11467. Cambridge, MA: National Bureau of Economic Research.
• Stock, J. H., & Watson, M. W. (2016). Dynamic factor models, factor-augmented vector autoregressions, and structural vector autoregressions in macroeconomics. In J. B. Taylor & H. Uhlig (Eds.), Handbook of macroeconomics (Vol. 2, pp. 415–525). Amsterdam, The Netherlands: Elsevier.