
Machine Learning in Policy Evaluation: New Tools for Causal Inference  

Noémi Kreif and Karla DiazOrdaz

While machine learning (ML) methods have received a lot of attention in recent years, these methods are primarily designed for prediction. Empirical researchers conducting policy evaluations are, on the other hand, preoccupied with causal problems, trying to answer counterfactual questions: what would have happened in the absence of a policy? Because these counterfactuals can never be directly observed (described as the “fundamental problem of causal inference”), prediction tools from the ML literature cannot be readily used for causal inference. In the last decade, major innovations have taken place incorporating supervised ML tools into estimators for causal parameters such as the average treatment effect (ATE). This holds the promise of attenuating model misspecification issues and increasing transparency in model selection. One particularly mature strand of the literature includes approaches that incorporate supervised ML in the estimation of the ATE of a binary treatment, under the unconfoundedness and positivity assumptions (also known as exchangeability and overlap assumptions). This article begins by reviewing popular supervised machine learning algorithms, including tree-based methods and the lasso, as well as ensembles, with a focus on the Super Learner. Then, some specific uses of machine learning for treatment effect estimation are introduced and illustrated, namely (1) to create balance among treated and control groups, (2) to estimate so-called nuisance models (e.g., the propensity score, or conditional expectations of the outcome) in semi-parametric estimators that target causal parameters (e.g., targeted maximum likelihood estimation or the double ML estimator), and (3) to select variables in situations with a high number of covariates. 
Since there is no universal best estimator, whether parametric or data-adaptive, it is best practice to incorporate a semi-automated approach that can select the models best supported by the observed data, thus attenuating the reliance on subjective choices.
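
As a rough illustration of use (2), the sketch below computes an augmented inverse probability weighting (AIPW) estimate of the ATE, the doubly robust score that the double ML estimator of the ATE also builds on. It is a minimal sketch under stated assumptions, not the authors' implementation. The function names `stratum_nuisances` and `aipw_ate` and the toy data are hypothetical. The nuisance models are simple within-stratum means for a single binary covariate, standing in for an ML learner such as the Super Learner, and the cross-fitting used by double ML is omitted.

```python
from statistics import mean


def stratum_nuisances(x, a, y):
    """Toy nuisance 'learner' (assumption: x is discrete).

    Returns, for every unit, the estimated propensity score and the
    estimated mean outcome under treatment and under control, each
    computed within the unit's covariate stratum.
    """
    e_hat, m1_hat, m0_hat = [], [], []
    for xi in x:
        idx = [j for j, xj in enumerate(x) if xj == xi]
        e_hat.append(mean([a[j] for j in idx]))
        m1_hat.append(mean([y[j] for j in idx if a[j] == 1]))
        m0_hat.append(mean([y[j] for j in idx if a[j] == 0]))
    return e_hat, m1_hat, m0_hat


def aipw_ate(y, a, e_hat, m1_hat, m0_hat):
    """Average the AIPW score over units to estimate the ATE.

    Requires 0 < e_hat < 1 for every unit (the positivity assumption).
    """
    scores = []
    for yi, ai, e, m1, m0 in zip(y, a, e_hat, m1_hat, m0_hat):
        plug_in = m1 - m0                            # outcome-model contrast
        correction = ai * (yi - m1) / e - (1 - ai) * (yi - m0) / (1 - e)
        scores.append(plug_in + correction)          # residual-weighted fix-up
    return mean(scores)


# Hypothetical toy data: one binary covariate, binary treatment, outcome.
x = [0, 0, 0, 1, 1, 1]
a = [1, 0, 0, 1, 1, 0]
y = [3.0, 0.0, 2.0, 5.0, 5.0, 2.0]

e_hat, m1_hat, m0_hat = stratum_nuisances(x, a, y)
ate = aipw_ate(y, a, e_hat, m1_hat, m0_hat)
print(f"AIPW estimate of the ATE: {ate:.3f}")
```

Separating the nuisance step from the scoring step mirrors the structure described above: any supervised learner could replace `stratum_nuisances` without changing `aipw_ate`.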


Asset Pricing: Cross-Section Predictability  

Paolo Zaffaroni and Guofu Zhou

A fundamental question in finance is why different assets have different expected returns, which is intricately linked to the issue of cross-section prediction in the sense of addressing the question “What explains the cross section of expected returns?” There is a vast literature on this topic. There are state-of-the-art methods used to forecast the cross section of stock returns with firm characteristics as predictors, and the same methods can be applied to other asset classes, such as corporate bonds and foreign exchange rates, and to managed portfolios such as mutual and hedge funds. First, there are the traditional ordinary least squares and weighted least squares methods, as well as more recently developed machine learning approaches such as neural networks and genetic programming. These are the main methods used today in applications. There are three measures that assess how the various methods perform. The first is the Sharpe ratio of a long–short portfolio that longs the assets with the highest predicted return and shorts those with the lowest. This measure provides the economic value of one method versus another. The second measure is an out-of-sample R² that evaluates how the forecasts perform relative to a natural benchmark, the cross-section mean. This is important as any method that fails to outperform the benchmark is questionable. The third measure is how well the predicted returns explain the realized ones. This provides an overall error assessment across all the stocks. Factor models are another tool used to understand cross-section predictability. They shed light on whether the predictability is due to mispricing or risk exposure. There are three ways to consider these models: First, we can consider how to test traditional factor models and estimate the associated risk premia, where the factors are specified ex ante. Second, we can analyze similar problems for latent factor models. 
Finally, going beyond the traditional setup, we can consider recent studies on asset-specific risks. This analysis provides the framework to understand the economic driving forces of predictability.
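
The first two performance measures described above can be sketched in a few lines. This is a minimal illustration under assumptions that the abstract does not specify: equal-weighted legs with a fixed number of assets per leg, a Sharpe ratio that is not annualized, and a benchmark forecast supplied by the caller (here, the previous period's cross-section mean). All function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def long_short_return(predicted, realized, n_legs):
    """One-period return of an equal-weighted long-short portfolio.

    Longs the n_legs assets with the highest predicted return and
    shorts the n_legs assets with the lowest predicted return.
    """
    if 2 * n_legs > len(predicted):
        raise ValueError("legs would overlap: need 2 * n_legs <= number of assets")
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    short_leg = order[:n_legs]
    long_leg = order[-n_legs:]
    return mean([realized[i] for i in long_leg]) - mean([realized[i] for i in short_leg])


def sharpe_ratio(returns):
    """Mean divided by sample standard deviation of per-period returns.

    No risk-free rate is subtracted because a long-short portfolio is
    treated here as zero-investment (an assumption of this sketch).
    """
    return mean(returns) / stdev(returns)


def oos_r2(realized, predicted, benchmark):
    """Out-of-sample R^2: 1 - SSE(model forecasts) / SSE(benchmark forecasts)."""
    sse_model = sum((r - p) ** 2 for r, p in zip(realized, predicted))
    sse_bench = sum((r - b) ** 2 for r, b in zip(realized, benchmark))
    return 1.0 - sse_model / sse_bench


# Hypothetical data: four assets observed over two periods.
period1_pred = [0.01, 0.03, -0.02, 0.02]
period1_real = [0.00, 0.04, -0.01, 0.01]
period2_pred = [0.02, -0.01, 0.00, 0.04]
period2_real = [0.01, 0.00, 0.01, 0.03]

ls_returns = [
    long_short_return(period1_pred, period1_real, n_legs=1),
    long_short_return(period2_pred, period2_real, n_legs=1),
]
sr = sharpe_ratio(ls_returns)

# Benchmark for period 2: the cross-section mean of period 1 returns.
benchmark = [mean(period1_real)] * len(period2_real)
r2 = oos_r2(period2_real, period2_pred, benchmark)

print(f"long-short returns: {ls_returns}")
print(f"Sharpe ratio: {sr:.3f}, out-of-sample R^2: {r2:.3f}")
```

A positive out-of-sample R² means the forecasts have a lower squared error than the benchmark; a negative value means the method fails to outperform it.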