High-Dimensional Dynamic Factor Models have their origin in macroeconomics, more precisely in empirical research on Business Cycles. The central idea, going back to the work of Burns and Mitchell in the 1940s, is that the fluctuations of all the macro and sectoral variables in the economy are driven by a “reference cycle,” that is, a one-dimensional latent cause of variation. After a fairly long process of generalization and formalization, the literature settled at the beginning of the 2000s on a model in which (1) both n, the number of variables in the dataset, and T, the number of observations for each variable, may be large, and (2) all the variables in the dataset depend dynamically on a fixed, independent of n, number of “common factors,” plus variable-specific, usually called “idiosyncratic,” components. The structure of the model can be exemplified as follows:
x_{it} = α_i u_t + β_i u_{t−1} + ξ_{it},   i = 1, …, n,   t = 1, …, T,   (*)
where the observable variables x_{it} are driven by the white noise u_t, which is common to all the variables (the common factor), and by the idiosyncratic component ξ_{it}. The common factor u_t is orthogonal to the idiosyncratic components ξ_{it}, and the idiosyncratic components are mutually orthogonal (or weakly correlated). Lastly, the variations of the common factor u_t affect the variable x_{it} dynamically, that is, through the lag polynomial α_i + β_i L. Asymptotic results for High-Dimensional Factor Models, in particular consistency of estimators of the common factors, are obtained for both n and T tending to infinity.
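As a concrete illustration of (*), here is a minimal simulation sketch in Python; the loadings, the idiosyncratic noise scale, and the sample sizes n and T are arbitrary illustrative choices, not values taken from the literature.

```python
import numpy as np

rng = np.random.default_rng(0)

n, T = 100, 200                       # cross-sectional dimension and sample length (illustrative)
alpha = rng.normal(size=n)            # loadings on the common shock u_t
beta = rng.normal(size=n)             # loadings on the lagged common shock u_{t-1}

u = rng.normal(size=T + 1)            # common white-noise factor, one extra draw for the lag
xi = 0.5 * rng.normal(size=(n, T))    # idiosyncratic components (here mutually orthogonal)

# x_{it} = alpha_i * u_t + beta_i * u_{t-1} + xi_{it}
x = alpha[:, None] * u[1:] + beta[:, None] * u[:-1] + xi
print(x.shape)                        # (n, T)
```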
Model (*), generalized to allow for more than one common factor and a rich dynamic loading of the factors, has been studied in a fairly vast literature, with many applications based on macroeconomic datasets: (a) forecasting of inflation, industrial production, and unemployment; (b) structural macroeconomic analysis; and (c) construction of indicators of the Business Cycle. This literature can be broadly classified as belonging to the time-domain or the frequency-domain approach. The works based on the latter are the subject of the present chapter.
We start with a brief description of early work on Dynamic Factor Models. Formal definitions and the main Representation Theorem follow. The latter determines the number of common factors in the model by means of the spectral density matrix of the vector
(x_{1t}, x_{2t}, …, x_{nt}). Dynamic principal components, based on the spectral density of the x’s, are then used to construct estimators of the common factors.
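The following sketch, continuing under the same hypothetical data-generating process as above, estimates the spectral density matrix of the panel with a Bartlett lag-window estimator and inspects its eigenvalues across frequencies; with one dynamic factor, the largest eigenvalue should dominate at every frequency. The truncation lag M, the frequency grid, and all tuning constants are arbitrary illustrative choices; this is only a rough diagnostic in the spirit of dynamic principal components, not the estimators studied in this literature.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 100, 200
alpha, beta = rng.normal(size=n), rng.normal(size=n)
u, xi = rng.normal(size=T + 1), 0.5 * rng.normal(size=(n, T))
x = alpha[:, None] * u[1:] + beta[:, None] * u[:-1] + xi   # same panel as the sketch above

def spectral_density(x, M=20, n_freq=101):
    """Bartlett lag-window estimate of the spectral density matrix of x (n x T)."""
    n, T = x.shape
    xc = x - x.mean(axis=1, keepdims=True)
    freqs = np.linspace(-np.pi, np.pi, n_freq)
    S = np.zeros((n_freq, n, n), dtype=complex)
    for k in range(-M, M + 1):
        w = 1.0 - abs(k) / (M + 1)                      # Bartlett weights
        if k >= 0:
            gamma = xc[:, k:] @ xc[:, :T - k].T / T     # sample autocovariance at lag k
        else:
            gamma = xc[:, :T + k] @ xc[:, -k:].T / T
        S += w * gamma[None, :, :] * np.exp(-1j * k * freqs)[:, None, None]
    return freqs, S / (2 * np.pi)

freqs, S = spectral_density(x)
eig = np.array([np.linalg.eigvalsh(S[j])[::-1] for j in range(len(freqs))])
# average share of the trace captured by the largest dynamic eigenvalue across frequencies
print((eig[:, 0] / eig.sum(axis=1)).mean())
```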
These results, obtained in the early 2000s, are compared to the literature based on the time-domain approach, in which the covariance matrix of the x’s and its (static) principal components are used instead of the spectral density and dynamic principal components. Dynamic principal components produce two-sided estimators, which are good within the sample but unfit for forecasting. The estimators based on the time-domain approach are simple and one-sided. However, they require the restriction that the space spanned by the factors be finite-dimensional.
Recent papers have constructed one-sided estimators based on the frequency-domain method for the unrestricted model. These estimators exploit results on stochastic processes of dimension n that are driven by a q-dimensional white noise, with q < n, that is, singular vector stochastic processes. The main features of this literature are described in some detail.
Lastly, we report and comment on the results of an empirical paper, the last in a long list, comparing predictions obtained with time-domain and frequency-domain methods. The paper uses a large monthly U.S. dataset covering the Great Moderation and the Great Recession.
Article
High-dimensional dynamic factor models have their origin in macroeconomics, more specifically in empirical research on business cycles. The central idea, going back to the work of Burns and Mitchell in the 1940s, is that the fluctuations of all the macro and sectoral variables in the economy are driven by a “reference cycle,” that is, a one-dimensional latent cause of variation. After a fairly long process of generalization and formalization, the literature settled at the beginning of the 2000s on a model in which (a) both n, the number of variables in the data set, and T, the number of observations for each variable, may be large; (b) all the variables in the data set depend dynamically on a fixed, independent of n, number of common shocks, plus variable-specific, usually called idiosyncratic, components. The structure of the model can be exemplified as follows:
x_{it} = α_i u_t + β_i u_{t−1} + ξ_{it},   i = 1, …, n,   t = 1, …, T,   (*)
where the observable variables x_{it} are driven by the white noise u_t, which is common to all the variables (the common shock), and by the idiosyncratic component ξ_{it}. The common shock u_t is orthogonal to the idiosyncratic components ξ_{it}, and the idiosyncratic components are mutually orthogonal (or weakly correlated). Last, the variations of the common shock u_t affect the variable x_{it} dynamically, that is, through the lag polynomial α_i + β_i L. Asymptotic results for high-dimensional factor models, consistency of estimators of the common shocks in particular, are obtained for both n and T tending to infinity.
The time-domain approach to these factor models is based on the transformation of dynamic equations into static representations. For example, equation (*) becomes
x_{it} = α_i F_{1t} + β_i F_{2t} + ξ_{it},   F_{1t} = u_t,   F_{2t} = u_{t−1}.
Instead of the dynamic equation (*) there is now a static equation, while instead of the white noise u_t there are now two factors, also called static factors, which are dynamically linked: F_{1t} = u_t, F_{2t} = F_{1,t−1}.
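As a minimal numerical check of this stacking (using the same hypothetical simulated panel as in the earlier sketch), the common part of the dynamic equation is reproduced exactly by a static equation in F_{1t} and F_{2t}:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 100, 200
alpha, beta = rng.normal(size=n), rng.normal(size=n)
u = rng.normal(size=T + 1)
xi = 0.5 * rng.normal(size=(n, T))
x = alpha[:, None] * u[1:] + beta[:, None] * u[:-1] + xi   # model (*)

F = np.vstack([u[1:], u[:-1]])         # static factors: F_{1t} = u_t, F_{2t} = u_{t-1}
Lam = np.column_stack([alpha, beta])   # static loadings: row i is (alpha_i, beta_i)
print(np.allclose(x, Lam @ F + xi))    # True: x_{it} = alpha_i F_{1t} + beta_i F_{2t} + xi_{it}
```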
This transformation into a static representation, whose general form is

x_{it} = λ_{i1} F_{1t} + ⋯ + λ_{ir} F_{rt} + ξ_{it},

is extremely convenient for estimation and forecasting of high-dimensional dynamic factor models. In particular, the factors F_{jt} and the loadings λ_{ij} can be consistently estimated from the principal components of the observable variables x_{it}.
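To illustrate this claim, the following sketch simulates the same hypothetical panel, estimates r = 2 static factors and their loadings from the principal components of the sample covariance matrix of the x’s, and checks that the estimated factors span, up to rotation, the space of the true static factors (u_t, u_{t−1}); all tuning choices are illustrative, not taken from the cited literature.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, r = 100, 500, 2
alpha, beta = rng.normal(size=n), rng.normal(size=n)
u = rng.normal(size=T + 1)
xi = 0.5 * rng.normal(size=(n, T))
x = alpha[:, None] * u[1:] + beta[:, None] * u[:-1] + xi    # model (*)

# Principal components of the sample covariance matrix of the x's
xc = x - x.mean(axis=1, keepdims=True)
cov = xc @ xc.T / T
eigval, eigvec = np.linalg.eigh(cov)
V = eigvec[:, ::-1][:, :r]             # eigenvectors of the r largest eigenvalues

F_hat = V.T @ xc                       # r x T estimated static factors
Lam_hat = V                            # n x r estimated loadings (up to rotation)

# The true static factors (u_t, u_{t-1}) should be well explained by F_hat
F_true = np.vstack([u[1:], u[:-1]])
coef = np.linalg.lstsq(F_hat.T, F_true.T, rcond=None)[0]
resid = F_true.T - F_hat.T @ coef
r2 = 1 - (resid ** 2).sum() / (F_true.T ** 2).sum()
print(round(r2, 3))                    # close to 1 for large n and T
```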
Assumptions allowing consistent estimation of the factors and loadings are discussed in detail. Moreover, it is argued that in general the vector of the factors is singular; that is, it is driven by a number of shocks smaller than its dimension. This fact has very important consequences. In particular, singularity implies that the fundamentalness problem, which is hard to solve in structural vector autoregressive (VAR) analysis of macroeconomic aggregates, disappears when the latter are studied as part of a high-dimensional dynamic factor model.
Article
Aris Spanos
The current discontent with the dominant macroeconomic theory paradigm, known as Dynamic Stochastic General Equilibrium (DSGE) models, calls for an appraisal of the methods and strategies employed in studying and modeling macroeconomic phenomena using aggregate time series data. The appraisal pertains to the effectiveness of these methods and strategies in accomplishing the primary objective of empirical modeling: to learn from data about phenomena of interest. The co-occurring developments in macroeconomics and econometrics since the 1930s provide the backdrop for the appraisal, with the Keynes vs. Tinbergen controversy at center stage. The overall appraisal is that the DSGE paradigm gives rise to estimated structural models that are both statistically and substantively misspecified, yielding untrustworthy evidence that contributes very little, if anything, to real learning from data about macroeconomic phenomena. A primary contributor to the untrustworthiness of evidence is the traditional econometric perspective of viewing empirical modeling as curve-fitting (structural models), guided by impromptu error-term assumptions and evaluated on goodness-of-fit grounds. Regrettably, excellent fit is neither necessary nor sufficient for the reliability of inference and the trustworthiness of the ensuing evidence. Recommendations on how to improve the trustworthiness of empirical evidence revolve around a broader model-based (non-curve-fitting) modeling framework that attributes cardinal roles to both theory and data without undermining the credibility of either source of information. Two crucial distinctions hold the key to securing the trustworthiness of evidence. The first distinguishes between modeling (specification, misspecification testing, respecification) and inference, and the second between a substantive (structural) model and a statistical model (the probabilistic assumptions imposed on the particular data). This enables one to establish statistical adequacy (the validity of these assumptions) before relating the statistical model to the structural model and posing questions of interest to the data. The greatest enemy of learning from data about macroeconomic phenomena is not the absence of an alternative and more coherent empirical modeling framework, but the illusion that foisting highly formal structural models on the data can give rise to such learning just because their construction and curve-fitting rely on seemingly sophisticated tools. Regrettably, applying sophisticated tools to a statistically and substantively misspecified DSGE model does nothing to restore the trustworthiness of the evidence stemming from it.