Gravity Models and Empirical Trade
Scott Baier, College of Business, Clemson University
and Samuel Standaert, Institute on Comparative Regional Integration Studies, United Nations University
Summary
The gravity model of international trade states that the volume of trade between two countries is proportional to their economic mass and a measure of their relative trade frictions. Perhaps because of its intuitive appeal, the gravity model has been the workhorse model of international trade for more than 50 years. While the initial empirical work using the gravity model lacked sound theoretical underpinnings, subsequent theoretical developments have shown how a gravity-like specification can be derived from many models with varying assumptions about preferences, technology, and market structure. Along with the strengthening of the theoretical roots of the gravity model, the way in which it is estimated has also evolved significantly since the start of the new millennium. Depending on the exact characteristics of the regression, different estimation methods should be used to estimate the gravity model.
Subjects
 International Economics
The Workhorse of International Trade
For more than 50 years, the gravity model has been the workhorse model of empirical international trade. Originally the model was presented as a simple analogy between Newton’s Universal Law of Gravitation and factors that would influence bilateral trade flows. The flow of trade between two countries was posited to be proportional to the economic size of the trading partners and inversely related to their distance from each other. As formulated, the gravity equation of international trade could be rewritten as a log-linear empirical specification that could be easily estimated. A large number of studies showed that the empirical findings were consistent with the naïve gravity model. In particular, the coefficient estimates of the elasticity of bilateral trade to importer and exporter GDP were close to unity, and the elasticity of trade with respect to bilateral distance was negative; moreover, the empirical specification was able to account for a reasonable amount of the observed variation in trade.
Even though the model was an empirical success, the gravity equation lacked a sound theoretical background. Beginning in the late 1970s, several authors showed that a gravity-like specification would emerge from a variety of standard assumptions regarding preferences, technology, market structure, and trade. At the same time, empirical trade economists became more concerned about the estimation strategy; in particular, that estimation using ordinary least squares might lead to biased coefficient estimates. The purpose of this review is to trace the history of the gravity equation of international trade and provide context for its evolution. The review also describes the current state of the field and highlights areas for future research.^{1}
Since much of the work on the gravity equation has been designed to identify factors that may reduce or enhance bilateral trade, the paper starts by using a naïve gravity specification to show how geography, history, culture, and government policies appear to influence trade flows by looking at the cross-section of data for 145 countries in 2014. It goes on to provide an overview of theoretical models and empirical specifications from 1970 through 2001. The subsequent section works through four of the standard models of international trade and shows how each model leads to a similar empirical specification: the structural gravity model. This section also briefly covers how these models can be extended to include tariffs, intermediate goods, and multiple sectors; it concludes by reviewing recent theoretical models that lead to a gravity-like empirical specification.
The final section of the article reviews the state of the empirical specifications. It starts with a discussion of the conditions under which the log-linear gravity model estimated by ordinary least squares will yield consistent estimates of the coefficients of interest. In most cases, however, these conditions are not satisfied, and an alternative estimator is needed. Santos Silva and Tenreyro (2006) showed that the Poisson Pseudo Maximum Likelihood Estimator has desirable properties that make it attractive for empirical gravity work. This estimator is contrasted with the Gamma Pseudo Maximum Likelihood Estimator and Nonlinear Least Squares, and different specification tests are discussed that may assist in choosing among them. Another issue that can arise in estimating the gravity model is the endogeneity of the control variables, the typical solutions for which are also briefly discussed. The section concludes with a discussion of the path for future work.
Gravity: A First Look at the Data
The early empirical gravity models of international trade were rooted in a simple and intuitive analogy to Newton’s Law of Universal Gravitation. According to Newton’s law, the force of attraction between two bodies is proportional to the product of their masses and inversely proportional to their distance squared. These early gravity models of trade postulated a similar relationship between the bilateral trade flows between two countries, their economic sizes, and a measure of trade frictions. The lack of theoretical underpinnings for this relationship is the reason why it is referred to as the naïve gravity model. Mathematically, it can be expressed as
$$X_{ij} = \beta_0\, Y_i^{\beta_1}\, Y_j^{\beta_2}\, dist_{ij}^{\beta_3}\, \epsilon_{ij} \tag{1}$$
where $X_{ij}$ is bilateral trade between exporting country $i$ and importing country $j$, $Y_i$ ($Y_j$) is the gross domestic product in country $i$ ($j$), and $dist_{ij}$ is the bilateral distance between country $i$ and $j$. $\epsilon_{ij}$ is typically assumed to be a lognormally distributed error term. Given the multiplicative structure and the assumption on the error term, Equation 1 can be estimated by taking the natural logarithm, which leads to a log-linear specification
$$\ln X_{ij} = \ln \beta_0 + \beta_1 \ln Y_i + \beta_2 \ln Y_j + \beta_3 \ln dist_{ij} + \ln \epsilon_{ij}$$
Literally hundreds of papers have estimated the gravity equation by ordinary least squares. Intuitively, one would expect the economic size of the countries to have a positive effect ($\beta_1 > 0$, $\beta_2 > 0$) and distance to have a negative effect ($\beta_3 < 0$). While the coefficient estimates have varied from study to study depending on the period and the sample of countries, the estimated coefficients $\beta_1$ and $\beta_2$ were typically found to be close to unity, while that of $\beta_3$ was typically negative and statistically significant. What cemented the gravity model’s popularity was its ability to explain much of the observed variation in bilateral trade flows.
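To make the estimation step concrete, the following is a minimal sketch of fitting the log-linear naïve gravity specification by ordinary least squares. The data are synthetic, the "true" parameter values are assumptions chosen only for illustration, and the small normal-equations solver stands in for whatever regression routine a researcher would actually use.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Synthetic country pairs; all numbers below are assumed, not taken from data.
random.seed(0)
true_beta = [-5.0, 1.0, 1.0, -1.2]  # constant, ln Y_i, ln Y_j, ln dist_ij
rows, ln_trade = [], []
for _ in range(2000):
    ln_y_i = random.uniform(20.0, 30.0)
    ln_y_j = random.uniform(20.0, 30.0)
    ln_dist = random.uniform(5.0, 10.0)
    x = [1.0, ln_y_i, ln_y_j, ln_dist]
    noise = random.gauss(0.0, 0.5)  # log of a lognormal error term
    rows.append(x)
    ln_trade.append(sum(b * v for b, v in zip(true_beta, x)) + noise)

beta_hat = ols(rows, ln_trade)
print("income elasticities:", beta_hat[1], beta_hat[2])
print("distance elasticity:", beta_hat[3])
```

Because the simulated error is lognormal and independent of the regressors, OLS on the logged equation recovers the assumed elasticities here; the later discussion of estimators concerns cases where that assumption fails.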
The core relationship of gravity models can be easily illustrated using the overall patterns in trade data. In a world without trade frictions, a simple gravity relationship is given by $X_{ij} = \frac{Y_i Y_j}{Y_W}$ where, as before, $Y_i$ ($Y_j$) is the GDP of country $i$ ($j$) and $Y_W$ is world income. The frictionless gravity equation can be rearranged to relate country $j$’s expenditure share on goods produced in country $i$ ($\frac{X_{ij}}{Y_j}$) to the latter’s share of world production ($\frac{Y_i}{Y_W}$).^{2} Using trade data from 2014 for 125 countries, Figure 1 plots the relationship between expenditure shares and production shares on a logarithmic scale. The high correlation between these variables shown in this graph is consistent with, and part of the appeal of, the empirical structure of the gravity equation.^{3} At the same time, Figure 1 also shows that more than 90% of all expenditure shares remain below the 45-degree line. That these import shares fall consistently below the production shares indicates that the world is far from frictionless.
Modifying the frictionless gravity equation gives a crude measure of trade costs. To start, we rewrite the bilateral trade flows as
$$X_{ij} = \frac{Y_i Y_j}{Y_W}\, \varphi_{ij}^{-\epsilon}$$
where $\varphi_{ij}$ is the bilateral cost of trading and the strictly positive $\epsilon$ is the elasticity of bilateral trade flows with respect to these trade costs. Much of the empirical gravity literature has been devoted to identifying and quantifying the factors that influence trade costs. They can be classified as costs induced by geography (natural trade costs), costs associated with historical and cultural linkages, and costs induced by policies (sometimes referred to as “unnatural trade costs”). Researchers interested in international trade and economic geography have emphasized the role of natural trade costs (often referred to as second nature geography) and how these natural trade costs are associated with the respective location of the economic agents. An obvious empirical measure of such costs is the distance between countries. Limao and Venables (2001) and Hummels (2007) investigated the empirical relationship between distance and observed CIF/FOB trade costs (that is, all the costs associated with shipping the goods and insuring them against damage during transport). These authors found a positive correlation between distance and trade costs. Indeed, one of the most robust findings in the empirical gravity literature is the negative relationship between distance and bilateral trade, or its equivalent: the positive relationship between the natural logarithm of distance and trade costs. A rough measure of trade costs is obtained by rewriting the naïve gravity equation as
$$\varphi_{ij}^{\epsilon} = \frac{Y_i Y_j}{Y_W X_{ij}}$$
Figure 2 plots the relationship between this measure of trade costs and distance, depicting their clear positive relationship. The fitted line indicates that trade falls by 1.4% for every 1% increase in distance.
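The construction behind a plot of this kind can be sketched as follows. The sketch assumes that the crude trade-cost measure is the ratio of frictionless predicted trade, $Y_iY_j/Y_W$, to observed trade, and it uses made-up country pairs whose trade was generated with an assumed distance elasticity of 1.4 to echo the fitted line described above. Function names and data are hypothetical.

```python
import math

def trade_cost_measure(x_ij, y_i, y_j, y_w):
    """Ratio of frictionless predicted trade to observed trade (assumed measure)."""
    return (y_i * y_j / y_w) / x_ij

def fitted_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical pairs: (Y_i, Y_j, distance); world income is also assumed.
y_w = 100.0
pairs = [(10.0, 5.0, 500.0), (20.0, 8.0, 1500.0), (4.0, 6.0, 3000.0), (15.0, 3.0, 8000.0)]
assumed_elasticity = 1.4

ln_dist, ln_cost = [], []
for y_i, y_j, dist in pairs:
    # Observed trade generated as frictionless trade scaled down by distance.
    x_ij = (y_i * y_j / y_w) * dist ** (-assumed_elasticity)
    ln_dist.append(math.log(dist))
    ln_cost.append(math.log(trade_cost_measure(x_ij, y_i, y_j, y_w)))

slope = fitted_slope(ln_dist, ln_cost)
print("slope of ln(trade cost measure) on ln(distance):", slope)
```

With real data the points scatter around the line, and the slope is an estimate rather than the value built into the simulation.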
Other geographical factors are also posited to influence trade, including whether the countries share a border. It is frequently argued that contiguous countries have lower trade costs because their common border lowers both pecuniary and nonpecuniary costs of trade. Figure 3 depicts the relationship between trade costs and distance, where contiguous country pairs are depicted with the plus symbol (+). On average, the points that correspond to the bilateral pairs that share a border lie below the least-squares line representing the relationship between trade costs and distance, indicating that contiguous countries face lower trade costs.^{4}
In addition to geography, cultural and historical factors are likely to influence trade costs. For example, Figure 4 depicts the relationship between trade costs and distance, this time with bilateral pairs that have ever had a colonial relationship indicated with a diamond symbol. Since these country pairs fall on average below the fitted line, the figure again suggests that trade costs are lower for bilateral pairs that share a colonial history.^{5} Potential explanations for this may be that the colonial history implies more familiarity with each other or more similar institutions. Alternatively, differences in resources that increase trade between the two countries may have been a factor in establishing the colonial relationship in the first place.
Finally, some trade costs are likely attributable to government policies. For example, higher tariffs, economic sanctions, and other forms of regulations likely raise the cost of trading and hence reduce trade. Free trade agreements, on the other hand, are typically designed to lower trade costs and boost trade. Figure 5a depicts the relationship between trade costs and distance where countries that have a free trade agreement are depicted with a gold cross.
However, it is not immediately evident that, conditional on distance, countries that have free trade agreements face lower trade costs. A potential explanation may be that other factors associated with trade costs also need to be included. Alternatively, the formation of the trade agreements could be a response to other factors such as high trade costs. Figure 5b depicts the separate fitted lines for the relationship between trade costs and distance with and without trade agreements. These fitted lines indicate that over shorter distances trade costs are lower, on average, for bilateral pairs that have a trade agreement. However, as the distance between the countries increases, free trade agreements appear to have a smaller impact on trade costs.^{6}
While these figures are only suggestive of the relationships between bilateral trade and geography, history, and government policies, the theories described in subsequent sections provide guidance on how other confounding factors can be controlled for when specifying an empirical gravity model.
Early Theoretical Developments and Empirical Applications
The gravity framework initially was appealing to researchers because the log-linear model was a simple and intuitive empirical way to assess the relationship between bilateral trade flows, production, income, and variables that could conceivably be viewed as factors that distort bilateral trade. When applied to trade data, the coefficient estimates were typically economically and statistically significant, and the simple gravity specification seemed to account for a large share of the variation of bilateral trade flows. Even though the gravity equation was considered an empirical success, it was often criticized for lacking sound theoretical foundations. Many of the early attempts to provide a theoretical foundation for the gravity model showed that bilateral trade was a function of incomes but did not provide an explicit rationale for the inclusion of distance and other trade costs. For example, Leamer and Stern (1970) presented a probabilistic model of bilateral trade flows. In their model, it was assumed that each transaction was of the same size $(\gamma)$ and that the likelihood that an exporter in $i$ would meet and trade with an importer in $j$ would depend on the trade capacity of each of the two countries relative to total trade. If the trade capacity of the exporting (importing) country $i$ ($j$) is given by $F_i$ ($F_j$), then the probability of trade between an exporter and importer is given by $p_{ij} = \frac{F_i}{F_W}\frac{F_j}{F_W}$, where $F_W$ represents total world trade. If there are $N$ transactions of size $\gamma$, then total world trade is given by $F_W = N\gamma$ and the volume of trade between $i$ and $j$ is given by
$$X_{ij} = N\gamma\, p_{ij} = \frac{F_i F_j}{F_W}$$
Letting gross domestic product proxy for trade capacity results in the frictionless gravity equation. Leamer and Stern (1970) then asserted that it is plausible to assume that the likelihood of trade between two countries would depend on their proximity to each other so that bilateral trade would be given by
$$X_{ij} = \frac{Y_i Y_j}{Y_W}\, g\left(dist_{ij}\right)$$
where ${g}^{\prime}\left(dis{t}_{ij}\right)<0$.
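The probabilistic argument can be traced numerically. The sketch below is illustrative only: the three trade capacities are invented, and world trade $F_W$ is assumed to equal the sum of the capacities. It computes the expected flow as the number of transactions times the transaction size times the matching probability $p_{ij} = (F_i/F_W)(F_j/F_W)$, which collapses to $F_iF_j/F_W$.

```python
def expected_trade(f_i, f_j, f_w, n_transactions):
    """Expected bilateral flow in the Leamer-Stern style matching model."""
    gamma = f_w / n_transactions      # transaction size implied by F_W = N * gamma
    p_ij = (f_i / f_w) * (f_j / f_w)  # probability a transaction links i and j
    return n_transactions * gamma * p_ij

# Hypothetical trade capacities; F_W is assumed to be their sum.
capacities = {"A": 50.0, "B": 30.0, "C": 20.0}
f_w = sum(capacities.values())

flows = {
    (i, j): expected_trade(capacities[i], capacities[j], f_w, n_transactions=1000)
    for i in capacities
    for j in capacities
}
print("expected A -> B flow:", flows[("A", "B")])
print("sum of all expected flows:", sum(flows.values()))
```

The number of transactions drops out of the result, which is why the frictionless equation depends only on the capacities and their world total.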
Anderson (1979) also derived a simple, frictionless gravity equation. In a world without trade costs and where preferences are homothetic and defined over a distinct basket of goods produced by each country, Anderson showed that the volume of trade between country $i$ and country $j$ is given by $X_{ij} = \theta_i Y_j$ where $\theta_i$ represents the representative agent’s preferences for the good produced in country $i$. Goods market clearing (i.e., total goods supplied equal total goods demanded) implies that
$$Y_i = \sum_{j=1}^{N} X_{ij} = \theta_i \sum_{j=1}^{N} Y_j = \theta_i Y_W$$
Substituting for $\theta_i$ yields the frictionless gravity equation
$$X_{ij} = \frac{Y_i Y_j}{Y_W}$$
Even this simple formulation without trade costs provides a few simple, testable hypotheses regarding bilateral trade flows. Helpman (1987) and Baier and Bergstrand (2001) showed that this model predicts that bilateral trade increases as the difference in the economic size of the two countries decreases and when total economic size increases. To see this, define $s_i = \frac{Y_i}{Y_i + Y_j}$, $s_j = \frac{Y_j}{Y_i + Y_j}$, and $s_{ij} = \frac{Y_i + Y_j}{Y_W}$ so that the frictionless gravity equation can be expressed as $\frac{X_{ij}}{Y_W} = s_i s_j s_{ij}^{2}$.
This equation is linear when log transformed and can be estimated as
$$\ln\left(\frac{X_{ij}}{Y_W}\right) = \beta_0 + \beta_1 \ln\left(s_i s_j\right) + \beta_2 \ln\left(s_{ij}\right) + \epsilon_{ij}$$
One would expect $\beta_1$ to be close to unity and $\beta_2$ to be close to two. Estimating this model using bilateral data from 2014 for 145 countries yields the following parameters
where the standard errors are in parentheses, $r^2 = 0.531$ and $N = 17{,}538$. The coefficient estimates of both $\ln(s_i s_j)$ and $\ln(s_{ij})$ differ from their hypothesized values at the 95% confidence level; however, the estimates are roughly consistent with the simple, frictionless gravity equation.
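The share decomposition behind this regression is an identity, which a few lines of arithmetic can confirm. The sketch below uses invented income figures; it checks that $s_i s_j s_{ij}^2$ reproduces $X_{ij}/Y_W$ under frictionless trade and illustrates the size-similarity prediction by holding the pair's combined income fixed.

```python
def frictionless_trade(y_i, y_j, y_w):
    """Frictionless gravity: X_ij = Y_i * Y_j / Y_W."""
    return y_i * y_j / y_w

def share_decomposition(y_i, y_j, y_w):
    """Right-hand side s_i * s_j * s_ij**2 of the share form."""
    s_i = y_i / (y_i + y_j)
    s_j = y_j / (y_i + y_j)
    s_ij = (y_i + y_j) / y_w
    return s_i * s_j * s_ij ** 2

y_w = 100.0  # hypothetical world income

# The two sides of the share form agree for any pair of incomes.
lhs = frictionless_trade(12.0, 8.0, y_w) / y_w
rhs = share_decomposition(12.0, 8.0, y_w)

# Same combined income (10), different dispersion: similar sizes trade more.
similar = frictionless_trade(5.0, 5.0, y_w)
dissimilar = frictionless_trade(9.0, 1.0, y_w)
print(lhs, rhs, similar, dissimilar)
```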
In addition to the model with costless trade, Anderson (1979) presented several models where bilateral trade is influenced by trade costs. The most widely cited of these models is the Armington model. In this model, the representative agent’s preferences are defined over goods, where each good is uniquely produced by one country. These preferences are characterized by a constant elasticity of substitution (CES) utility function given by
$$U_j = \left(\sum_{i=1}^{N} \beta_i^{\frac{1}{\sigma}}\, c_{ij}^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}$$
The representative agent maximizes her utility subject to a budget constraint given by $w_j = \sum_{i=1}^{N} p_{ij} c_{ij}$ where $w_j$ is the wage rate of the representative agent in country $j$ and $p_{ij}$ and $c_{ij}$ are respectively the landed price (i.e., the price paid by consumers in $j$) and the consumption of the good produced in $i$ and consumed in $j$. Anderson assumed that trade was subject to iceberg trading costs such that if one unit of the good is shipped from country $i$ to country $j$ only $\frac{1}{t_{ij}}$ of the good would arrive in country $j$ ($t_{ij} > 1$ for $i \ne j$). As markets are assumed to be competitive, the landed price is simply equal to the factory gate price, $p_i$, scaled up by the iceberg trading costs so that $p_{ij} = p_i t_{ij}$.
The Armington assumptions imply that country $j$’s total expenditures on goods produced in $i$ are given by
$$X_{ij} = \beta_i \left(\frac{p_i t_{ij}}{P_j}\right)^{1-\sigma} Y_j$$
where $Y_j$ is aggregate income ($Y_j = w_j L_j$) and $P_j = \left(\sum_{i=1}^{N} \beta_i \left(p_i t_{ij}\right)^{1-\sigma}\right)^{\frac{1}{1-\sigma}}$. By summing over all destinations, market clearing implies
$$Y_i = \sum_{j=1}^{N} X_{ij} = \beta_i\, p_i^{1-\sigma} \sum_{j=1}^{N} \left(\frac{t_{ij}}{P_j}\right)^{1-\sigma} Y_j$$
If the quantity of each country’s good is defined so that its price is equal to unity, the following expression for bilateral trade can be obtained by substituting for $\beta_i$
$$X_{ij} = Y_i Y_j\, t_{ij}^{1-\sigma} \left[ P_j^{\sigma-1} \left( \sum_{k=1}^{N} \left(\frac{t_{ik}}{P_k}\right)^{1-\sigma} Y_k \right)^{-1} \right] \tag{4}$$
In order to estimate Equation 4, many authors assumed that the term in brackets exhibited little variation across bilateral trading partners so that it could safely be ignored. Additionally, it was assumed that trade costs could be modeled as $t_{ij}^{1-\sigma} = \exp\left(z_{ij}^{N}\beta^{N} + z_{ij}^{H}\beta^{H} + z_{ij}^{P}\beta^{P} + e_{ij}\right)$, where $z_{ij}^{N}$ is a vector of variables capturing natural trading costs, $z_{ij}^{H}$ is a vector of variables associated with history and cultural factors, $z_{ij}^{P}$ is a vector of trade costs associated with policy, and $e_{ij}$ is a normally distributed error term. Taking the natural logarithm of Equation 4 yields
$$\ln X_{ij} = \beta_0 + \ln Y_i + \ln Y_j + z_{ij}^{N}\beta^{N} + z_{ij}^{H}\beta^{H} + z_{ij}^{P}\beta^{P} + e_{ij}$$
Rather than ignoring the term in brackets, some authors approximated the term by computing a “remoteness” index (see, e.g., Wei [1996], Helliwell [1997], and Harrigan [2003]). While the coefficient estimates on the remoteness variables had the expected signs and were statistically significant, the variables were simple reduced-form representations that typically were a GDP-weighted measure of distance that did not incorporate all aspects of the trade cost vector.
Bergstrand (1985) built on the Anderson framework by including a nested CES demand structure that allowed for the elasticity of substitution among imported goods to differ from the elasticity of substitution between domestically produced goods and imported goods. Unlike Anderson (1979), Bergstrand (1985) assumed that there are costs associated with distributing the products to each potential market and that this cost could be modeled with a constant elasticity of transformation (CET) function. Assuming that incomes and prices were exogenous, the CET technology yielded a set of export supply equations that can be linked to the CES system of demand equations, yielding the following gravity equation
where $\nu $ is the elasticity of substitution between domestically produced goods and importables. One of the main contributions of the Bergstrand framework was to show that in addition to incomes and trade costs, bilateral trade flows depended on importer and exporter price indices. Furthermore, with data on gross tariffs, this framework allows the researcher to identify the elasticity of substitution among importables ($\sigma $) and the elasticity of transformation parameter ($\gamma $). While the Bergstrand model featured price indices that were related to trade and trade costs, it was not clear at the time how to account for these indices.
One weakness of the Armington models of Anderson (1979) and Bergstrand (1985) was that product differentiation was determined arbitrarily depending on the country of origin. The new trade models of Krugman (1979), Krugman (1980) and Helpman and Krugman (1989), which were developed to account for intra-industry trade, provided a richer supply-side model of production. The key characteristics of these models are that the market structure is monopolistically competitive, consumers have a love of variety, firms within a country all have the same production technology, and the production technology for each firm exhibits increasing returns to scale.
Specifically, consumer preferences are defined by a utility function with constant elasticity of substitution over the varieties of a good
$$U_j = \left(\int_{\omega \in \Omega} c_j(\omega)^{\frac{\sigma-1}{\sigma}}\, d\omega\right)^{\frac{\sigma}{\sigma-1}}$$
where $\omega$ represents a distinct variety. These preferences are typically referred to as Dixit-Stiglitz preferences. Given that firms are homogeneous within a country and that preferences are symmetric over all varieties, the utility function of the representative agent can be expressed as
$$U_j = \left(\sum_{i=1}^{N} N_i\, c_{ij}^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}$$
where $N_i$ is the number of varieties produced in country $i$. The representative agent maximizes her utility subject to a budget constraint, which is given by $w_j = \sum_{i=1}^{N} N_i p_{ij} c_{ij}$. The demand for traded goods of consumers in country $j$ for goods produced in country $i$ is given by
$$X_{ij} = N_i\, p_{ij}\, c_{ij}\, L_j = N_i \left(\frac{p_{ij}}{P_j}\right)^{1-\sigma} Y_j$$
where $P_j = \left\{\sum_{i=1}^{N} N_i\, p_{ij}^{1-\sigma}\right\}^{\frac{1}{1-\sigma}}$ and $Y_j = w_j L_j$.
In this model, firms face a fixed cost of production and constant marginal cost. With labor as the only input, production of a representative firm in country $i$ is given by $q_i = A_i\left(l_i - f_i\right)$ where $q_i$ is the output produced by the firm in country $i$, $A_i$ is the technology available to firms in country $i$, $l_i$ is the labor employed by the representative firm in country $i$, and $f_i$ is the fixed production cost for the representative firm in country $i$. Given the production technology, the wage bill for the firm is given by $w_i l_i = w_i\left(\frac{q_i}{A_i} + f_i\right)$. As in the Armington model, trade involves iceberg trade costs so that total production of a variety produced in country $i$ and shipped to all other markets is constrained by
$$\sum_{j=1}^{N} t_{ij}\, c_{ij}\, L_j \le q_i$$
Profit maximization implies
$$\max_{\{p_{ij}\}}\; \pi_i = \sum_{j=1}^{N} p_{ij}\, c_{ij}\, L_j - w_i\left(\frac{q_i}{A_i} + f_i\right)$$
The first-order conditions can be arranged to show that prices, inclusive of iceberg trade costs, are markups over marginal costs
$$p_{ij} = \frac{\sigma}{\sigma-1}\, \frac{w_i\, t_{ij}}{A_i}$$
Free entry implies zero excess profits, so that total output of each firm in country $i$ is given by $q_i = A_i f_i \left(\sigma - 1\right)$. Labor market clearing implies that $L_i = N_i\left(\frac{q_i}{A_i} + f_i\right) = N_i \sigma f_i$ or $N_i = \frac{L_i}{\sigma f_i}$, so that aggregate production in market $i$ is given by $Y_i = p_i Q_i = p_i N_i q_i = \frac{w_i}{A_i \rho}\left(\frac{A_i L_i \left(\sigma - 1\right)}{\sigma}\right) = w_i L_i$, where $\rho = \frac{\sigma - 1}{\sigma}$.
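The chain from markup pricing to $Y_i = w_i L_i$ can be verified with a short calculation. The sketch below assumes the production function $q_i = A_i(l_i - f_i)$ and a factory-gate price equal to $\sigma/(\sigma-1)$ times marginal cost $w_i/A_i$; the parameter values are arbitrary and the function name is hypothetical.

```python
def firm_equilibrium(wage, technology, fixed_cost, sigma, labor):
    """Free-entry outcome for a symmetric monopolistically competitive country."""
    rho = (sigma - 1.0) / sigma
    price = wage / (technology * rho)                # markup over marginal cost w/A
    output = technology * fixed_cost * (sigma - 1.0)  # zero-profit output per firm
    labor_per_firm = output / technology + fixed_cost
    n_firms = labor / labor_per_firm                 # labor market clearing
    profit = price * output - wage * labor_per_firm
    gdp = price * n_firms * output
    return {"price": price, "output": output, "n_firms": n_firms,
            "profit": profit, "gdp": gdp}

# Arbitrary illustrative parameters.
result = firm_equilibrium(wage=2.0, technology=1.5, fixed_cost=0.5, sigma=4.0, labor=100.0)
print(result)
```

In this example each firm uses $\sigma f_i = 2$ units of labor, so 50 firms operate, profits are zero, and the value of output equals the wage bill of 200.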
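A conditional equilibrium gravity equation is obtained by substituting $p_{ij} = p_i t_{ij}$ in the demand equation and substituting $N_i = \frac{Y_i}{p_i q_i}$ so that the demand equation can be expressed as
$$X_{ij} = \frac{Y_i Y_j}{q_i}\, p_i^{-\sigma}\, t_{ij}^{1-\sigma}\, P_j^{\sigma-1}$$
Assuming that technology is the same across countries, this equation can be estimated in log form as
$$\ln X_{ij} = \beta_0 + \ln Y_i + \ln Y_j - \sigma \ln p_i + \left(\sigma - 1\right) \ln P_j + \left(1 - \sigma\right) \ln t_{ij}$$
where the trade costs were modeled as
$$t_{ij}^{1-\sigma} = \exp\left(z_{ij}^{N}\beta^{N} + z_{ij}^{H}\beta^{H} + z_{ij}^{P}\beta^{P} + e_{ij}\right)$$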
This conditional equilibrium gravity equation includes the price of goods produced in $i$ and the price index for the commodities consumed in country $j$. Baier and Bergstrand (2001) used GDP deflators as empirical proxies for the price terms. However, as Feenstra (2004) pointed out, the GDP deflators do not reflect the international prices implied by the theory, so it should come as no surprise when the price terms were often statistically insignificant.
In order to obtain unbiased coefficient estimates, the empirical specification needs to account for the price terms ($p_i$ and $P_j$) or a more theoretically consistent measure of the remoteness terms. Since these variables are functions of the trade costs, they are likely correlated with the other right-hand side variables. As a result, failing to account for them correctly will lead to biased coefficient estimates. Anderson and van Wincoop (2003) were the first to provide guidance on estimating the gravity equation and accounting for the general equilibrium price terms in a theoretically consistent way. They showed that the price terms in the gravity equation were implicit functions of the trade costs, incomes, and expenditures, confirming what Linnemann (1966) had suggested many years earlier when he stated that prices were an equilibrium outcome of the trade costs and (assumed to be) exogenous incomes. However, unlike Linnemann, who had suggested that the price terms could be ignored, Anderson and van Wincoop stressed the importance of controlling for these price terms in order to obtain consistent estimates. In their application, Anderson and van Wincoop addressed what had been termed the “border puzzle.” This referred to the finding of McCallum (1995) that, controlling for distance and economic size, trade between Canadian provinces was 22 times higher than trade between Canadian provinces and U.S. states. However, once the theoretically consistent price terms are accounted for and the comparative statics outlined in Anderson and van Wincoop are used, the impact of the border is dramatically reduced. Since the publication of the Anderson and van Wincoop paper, nearly all empirical specifications have attempted to account for the price terms.
Structural Gravity
The remainder of this section explains how, once the market structure and market clearing are taken into account, the Armington model, the monopolistically competitive trade model, the Eaton and Kortum (2002) Ricardian model, and Melitz’s (2003) heterogeneous firms model can all be written in a similar form. The so-called structural gravity equation takes the following shape
$$X_{ij} = G\, Y_i E_j \left(\frac{t_{ij}}{\Pi_i P_j}\right)^{-\epsilon}$$
where, as before, $G$ is a constant term, $t_{ij}$ are trade costs between countries $i$ and $j$, $Y_i$ is production in country $i$, $E_j$ is aggregate expenditures by country $j$, and $\epsilon$ is the trade elasticity. As will be explained in more detail, the exporter and importer price indexes ($\Pi_i$ and $P_j$) aggregate the trade costs over all trading partners
$$\Pi_i^{-\epsilon} = \sum_{j=1}^{N} \left(\frac{t_{ij}}{P_j}\right)^{-\epsilon} E_j, \qquad P_j^{-\epsilon} = \sum_{i=1}^{N} \left(\frac{t_{ij}}{\Pi_i}\right)^{-\epsilon} Y_i$$
Structural Gravity in the Armington Model
The Armington model assumes that preferences can be represented by a CES utility function where each country’s good enters into the utility function symmetrically.^{7} More formally, the utility function of the representative agent in country $j$ is given by
$$U_j = \left(\sum_{i=1}^{N} c_{ij}^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}} \tag{7}$$
The agents maximize their utility subject to a budget constraint, which states that expenditures equal income plus the trade deficit: $e_j = w_j + d_j \ge \sum_{i=1}^{N} p_{ij} c_{ij}$. The underlying assumption is that the trade deficit is attributable to macroeconomic factors that are not influenced by current trade or trade policies. Maximizing Equation 7 subject to the budget constraint and aggregating over consumers in country $j$ yields the following expression for country $j$’s consumption of country $i$’s good
$$C_{ij} = \left(\frac{p_{ij}}{P_j}\right)^{-\sigma} \frac{E_j}{P_j}$$
where the price index is given by $P_j = \left[\sum_{i=1}^{N} p_{ij}^{1-\sigma}\right]^{\frac{1}{1-\sigma}}$, $C_{ij} = c_{ij} L_j$ is aggregate consumption of goods produced in country $i$ and consumed in $j$, and total expenditures in country $j$ are given by $E_j = e_j L_j$.
Assuming that markets are perfectly competitive, the landed price of a good will be equal to the factory gate price scaled up by the iceberg trade costs; that is, $p_{ij} = p_i t_{ij}$. The value of bilateral exports from $i$ to $j$ is given by
$$X_{ij} = p_{ij} C_{ij} = \left(\frac{p_i t_{ij}}{P_j}\right)^{1-\sigma} E_j$$
and the price index can be expressed as $P_j = \left[\sum_{i=1}^{N} \left(p_i t_{ij}\right)^{1-\sigma}\right]^{\frac{1}{1-\sigma}}$.
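A quick numerical check helps fix ideas about the CES demand system. The sketch below assumes that the value of exports from $i$ to $j$ takes the standard CES form $X_{ij} = (p_i t_{ij}/P_j)^{1-\sigma} E_j$ with the price index given above, and verifies that each importer's spending across all sources (including itself) adds up to its total expenditure. Prices, trade costs, and expenditures are invented for illustration.

```python
def price_index(prices, costs_into_j, sigma):
    """CES price index P_j over landed prices p_i * t_ij."""
    total = sum((p * t) ** (1.0 - sigma) for p, t in zip(prices, costs_into_j))
    return total ** (1.0 / (1.0 - sigma))

def bilateral_exports(prices, t, expenditure, sigma):
    """Matrix x[i][j] of export values from i to j under symmetric CES demand."""
    n = len(prices)
    x = [[0.0] * n for _ in range(n)]
    for j in range(n):
        costs_into_j = [t[i][j] for i in range(n)]
        p_j = price_index(prices, costs_into_j, sigma)
        for i in range(n):
            x[i][j] = (prices[i] * t[i][j] / p_j) ** (1.0 - sigma) * expenditure[j]
    return x

# Hypothetical three-country world; t[i][i] = 1 means costless internal trade.
prices = [1.0, 1.2, 0.9]
t = [[1.0, 1.5, 2.0],
     [1.5, 1.0, 1.8],
     [2.0, 1.8, 1.0]]
expenditure = [100.0, 60.0, 40.0]
sigma = 5.0

x = bilateral_exports(prices, t, expenditure, sigma)
column_sums = [sum(x[i][j] for i in range(3)) for j in range(3)]
print(column_sums)
```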
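Market clearing implies $Y_i = \sum_{j=1}^{N} X_{ij}$
or
$$Y_i = p_i^{1-\sigma} \sum_{j=1}^{N} \left(\frac{t_{ij}}{P_j}\right)^{1-\sigma} E_j$$
or
$$p_i^{1-\sigma} = \frac{Y_i}{\Pi_i^{1-\sigma}} \tag{8}$$
where $\Pi_i = \left[\sum_{j=1}^{N} \left(\frac{t_{ij}}{P_j}\right)^{1-\sigma} E_j\right]^{\frac{1}{1-\sigma}}$.
Substituting $p_i^{1-\sigma}$ from Equation 8 into the price index and the trade flow equation yields the following set of equations
$$X_{ij} = Y_i E_j \left(\frac{t_{ij}}{\Pi_i P_j}\right)^{1-\sigma}$$
$$P_j^{1-\sigma} = \sum_{i=1}^{N} \left(\frac{t_{ij}}{\Pi_i}\right)^{1-\sigma} Y_i$$
$$\Pi_i^{1-\sigma} = \sum_{j=1}^{N} \left(\frac{t_{ij}}{P_j}\right)^{1-\sigma} E_j$$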
These price terms are referred to in Anderson and van Wincoop as the multilateral resistance terms. The outward multilateral resistance term, $\Pi_i$, is a weighted aggregate of all trade costs faced by the exporters in country $i$, while the inward multilateral resistance term, $P_j$, is a weighted aggregate of all trade costs faced by the importers in country $j$. Therefore, what matters for the volume of trade is the vector of bilateral trade costs relative to the inward and outward multilateral resistance terms. Finally, from an empirical standpoint, the trade elasticity in the Armington model is constant and is determined by the elasticity of substitution across goods $\left(\sigma\right)$.
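Because the inward and outward multilateral resistance terms are defined in terms of each other, they have to be solved for jointly. The sketch below is one simple way to do this, not the procedure of any particular paper: it assumes the system $\Pi_i^{1-\sigma} = \sum_j (t_{ij}/P_j)^{1-\sigma} E_j$ and $P_j^{1-\sigma} = \sum_i (t_{ij}/\Pi_i)^{1-\sigma} Y_i$, balanced trade ($E = Y$), and hypothetical data, and it alternates between the two equations until the implied flows $X_{ij} = Y_i E_j \left(t_{ij}/(\Pi_i P_j)\right)^{1-\sigma}$ add up to production and expenditure.

```python
def solve_multilateral_resistance(t, y, e, sigma, iterations=500):
    """Alternate between the two resistance equations.

    Returns Pi_i^(1-sigma) and P_j^(1-sigma). The terms are only pinned
    down up to a common scale factor, so the starting values act as a
    normalization.
    """
    n = len(y)
    k = [[t[i][j] ** (1.0 - sigma) for j in range(n)] for i in range(n)]
    p_pow = [1.0] * n
    pi_pow = [1.0] * n
    for _ in range(iterations):
        pi_pow = [sum(k[i][j] * e[j] / p_pow[j] for j in range(n)) for i in range(n)]
        p_pow = [sum(k[i][j] * y[i] / pi_pow[i] for i in range(n)) for j in range(n)]
    return pi_pow, p_pow

def trade_flows(t, y, e, sigma, pi_pow, p_pow):
    """Structural gravity flows implied by the resistance terms."""
    n = len(y)
    return [[y[i] * e[j] * t[i][j] ** (1.0 - sigma) / (pi_pow[i] * p_pow[j])
             for j in range(n)] for i in range(n)]

# Hypothetical three-country world with balanced trade.
t = [[1.0, 1.5, 2.0],
     [1.5, 1.0, 1.8],
     [2.0, 1.8, 1.0]]
y = [100.0, 60.0, 40.0]
e = list(y)
sigma = 5.0

pi_pow, p_pow = solve_multilateral_resistance(t, y, e, sigma)
x = trade_flows(t, y, e, sigma, pi_pow, p_pow)
row_sums = [sum(x[i]) for i in range(3)]
col_sums = [sum(x[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)
```

A change in any single bilateral cost moves every resistance term, which is why comparative statics in this framework require re-solving the whole system rather than reading an effect off one coefficient.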
In order to close the model, labor is assumed to be the only input in the production of good $i$, and the production function for the good produced in country $i$ $\left(i = 1, \dots, N\right)$ is assumed to exhibit constant returns to scale. If technology in country $i$ is given by $A_i$, then the factory gate price is given by $p_i = W_i / A_i$. Market clearing implies $W_i L_i = \sum_{j=1}^{N} X_{ij} = p_i^{1-\sigma} \Pi_i^{1-\sigma}$. After substituting for $p_i$, the market clearing condition can be rearranged to solve for the wage rate; that is,
$$W_i = A_i^{\frac{\sigma-1}{\sigma}}\, \Pi_i^{\frac{1-\sigma}{\sigma}}\, L_i^{-\frac{1}{\sigma}}$$
As one would expect, an increase in technology in country $i$ ($A_i$) increases the wage rate in country $i$. In addition, having better access to markets reduces the outward multilateral resistance term $\left(\Pi_i\right)$ and pushes up the wage rate, similar to a technological improvement. Finally, in the Armington model, increasing the population (the number of workers) lowers the wage because this increases the supply of country $i$’s good and lowers the price, which ends up being reflected back on factor prices.
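These comparative statics can be checked directly. The sketch below solves the market clearing condition $W_i L_i = p_i^{1-\sigma} \Pi_i^{1-\sigma}$ with $p_i = W_i / A_i$ for the wage, which gives $W_i = A_i^{(\sigma-1)/\sigma}\, \Pi_i^{(1-\sigma)/\sigma}\, L_i^{-1/\sigma}$, and then varies one argument at a time. It treats $\Pi_i$ as given, which ignores general equilibrium feedback, and the numbers are arbitrary.

```python
def armington_wage(technology, outward_resistance, labor, sigma):
    """Wage that clears W*L = (W/A)^(1-sigma) * Pi^(1-sigma)."""
    return (technology ** ((sigma - 1.0) / sigma)
            * outward_resistance ** ((1.0 - sigma) / sigma)
            * labor ** (-1.0 / sigma))

def market_clearing_gap(wage, technology, outward_resistance, labor, sigma):
    """Difference between the wage bill and the value of sales; zero at the solution."""
    price = wage / technology
    return wage * labor - price ** (1.0 - sigma) * outward_resistance ** (1.0 - sigma)

sigma = 4.0
base = armington_wage(2.0, 1.5, 10.0, sigma)
better_technology = armington_wage(2.2, 1.5, 10.0, sigma)
better_market_access = armington_wage(2.0, 1.3, 10.0, sigma)  # lower Pi_i
larger_workforce = armington_wage(2.0, 1.5, 12.0, sigma)

gap = market_clearing_gap(base, 2.0, 1.5, 10.0, sigma)
print(base, better_technology, better_market_access, larger_workforce, gap)
```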
Structural Gravity and Monopolistic Competition with Homogenous Firms
As in the previous section, agents have symmetric preferences over varieties of goods. Since each firm has identical technology within a country, the value of trade from country $i$ to country $j\phantom{\rule{0.2em}{0ex}}$ is given by
where $P_j=\left[\sum_{i=1}^{N}N_ip_{ij}^{1-\sigma}\right]^{\frac{1}{1-\sigma}}$ is the CES price index for the consumer in country $j$ and $N_i$ is the number of varieties (goods) produced in country $i$. As in the Armington model, the landed price in country $j$ is equal to the factory gate price scaled by the iceberg trading costs ($p_{ij}=p_it_{ij}$) and market clearing implies $Y_i=\sum_{j=1}^{N}X_{ij}$ or
Substituting for $N_ip_i^{1-\sigma}$ in the CES price index and into the bilateral trade flow equation yields the following system of equations
As in the Armington model, the trade elasticity is determined by the elasticity of substitution across varieties of goods $\left(\sigma\right)$ and is constant.
Factor prices are pinned down by substituting these into the market clearing condition for goods and labor: ${Y}_{i}={W}_{i}{L}_{i}$ and ${N}_{i}=\frac{{A}_{i}{L}_{i}}{\sigma {f}_{i}}$. Rearranging gives us
where $B_{MC}=f_i^{-\frac{1}{\sigma}}\left(\sigma^{-1}\left(\sigma-1\right)^{\frac{\sigma-1}{\sigma}}\right)$. Equations (12) to (15) yield a system of equations for trade flows, the multilateral resistance terms, and factor prices. As in the Armington model, factor prices are influenced by technology and the outward multilateral resistance term. However, unlike in the Armington model, the wage rate does not depend on the supply of labor. Instead, an increase in the amount of labor in country $i$ leads to an increase in the number of varieties in country $i$. Since consumers have preferences over varieties, the increase in demand perfectly offsets the increased production and the wage rate does not change.
The supply sides of both the Armington and the monopolistically competitive trade model are rather simplistic. The Armington model assumes that each country produces a single good and that the producers in the country do not face direct competition. Within a country, production exhibits constant returns to scale, so that pricing and market demand follow directly. In the monopolistically competitive trade model, households’ love of variety and increasing returns to production are essential for pinning down the number of varieties and the size of the firm. However, in most cases, it is assumed that each firm has access to the same technology so that production, sales, and exports are the same for all firms. The models developed by Eaton and Kortum (2002) and Melitz (2003) provide richer supply-side models of international trade. The Eaton-Kortum model is a Ricardian model with perfect competition where agents have preferences over varieties, and consumers buy from the low-cost producer. The Melitz model builds on the models developed by Krugman (1979, 1980) and Helpman and Krugman (1989) by modeling heterogeneous firms that differ in terms of their productivity.
Structural Gravity in the Multi-Country Ricardian Model
Eaton and Kortum (2002) extended the classic Dornbusch, Fischer, and Samuelson (1977) Ricardian model with a continuum of goods to a multi-country setting. In this setting, goods within a category are homogeneous, and Ricardian differences in technology imply that trade is driven by comparative advantage. It will be shown that the structural gravity model also emerges in this framework. Unlike in the previous models, the trade elasticity is determined by the dispersion parameter of the Fréchet distribution, which governs the dispersion of productivity across firms in different countries.
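The reason labor supply drops out can be seen in one line. The sketch below assumes the standard CES markup rule $p_i=W_i/(\rho A_i)$ with $\rho=(\sigma-1)/\sigma$ and a market clearing condition of the form $Y_i=N_ip_i^{1-\sigma}\Pi_i^{1-\sigma}$, by analogy with the Armington case; both are assumptions made for illustration, with normalization constants suppressed.

$$
W_iL_i=\frac{A_iL_i}{\sigma f_i}\left(\frac{W_i}{\rho A_i}\right)^{1-\sigma}\Pi_i^{1-\sigma}
\quad\Longrightarrow\quad
W_i=f_i^{-\frac{1}{\sigma}}\,\sigma^{-1}\left(\sigma-1\right)^{\frac{\sigma-1}{\sigma}}A_i\,\Pi_i^{\frac{1-\sigma}{\sigma}}
$$

Because $N_i$ is proportional to $L_i$, the labor force appears on both sides and cancels, leaving a wage that depends only on technology, fixed costs, and the outward multilateral resistance term.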
As in the monopolistically competitive model, aggregate demand by country $j$ for variety $\omega $ is given by
Eaton and Kortum assume that the technical efficiency of a firm in country $i$ is determined by a random draw from a Fréchet distribution. The CDF of this distribution is given by $F_i\left(z\right)=\exp\left(-T_iz^{-\theta}\right)$ where $T_i$ is a country-specific parameter reflecting the productivity distribution in $i$ ($T_i>0$), and $\theta$, common to all countries, represents the dispersion of technology. Eaton and Kortum assume that the productivity draws are independent. Furthermore, if labor is the only input into the production process and production exhibits constant returns to scale, the factory gate price for variety $\omega$ produced in country $i$ is given by $p_{ii}\left(\omega\right)=W_i/z_i$ where $z_i$ is the technology of the firm producing the good.^{8}
Given that the productivity draws are independent, it can be shown that the probability that a producer in country $i$ can deliver the product to country $j$ at a price lower than or equal to $p$ is given by
where $\Phi_j=\sum_{k=1}^{N}T_k\left(W_kt_{kj}\right)^{-\theta}$. The probability that country $i$'s good sells for the lowest price in market $j$ is given by
Given this probability, bilateral trade from $i$ to $j\phantom{\rule{0.2em}{0ex}}$ can be expressed as
Market clearing implies that
Eaton and Kortum show that the price index in country $j$ is given by $P_j=\gamma\Phi_j^{-\frac{1}{\theta}}$. Substituting the price index into the market clearing condition and into the trade flow equation yields
Goods market clearing implies that the wage rate is given by
where ${\overline{z}}_{i}={e}^{0.577}{T}_{i}^{\theta}$ is the geometric mean of ${z}_{i}$.
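The lowest-price probability that drives this section's gravity equation can be checked by simulation. The sketch below draws Fréchet productivities by inverting the CDF $F_i(z)=\exp\left(-T_iz^{-\theta}\right)$, computes delivered unit costs $W_it_{ij}/z_i$ for one destination, and compares the share of goods each country supplies at the lowest cost with $T_i\left(W_it_{ij}\right)^{-\theta}/\Phi_j$. All parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 4.0                          # dispersion parameter (illustrative)
T = np.array([1.0, 2.0, 0.5])        # country technology levels (illustrative)
W = np.array([1.0, 1.2, 0.8])        # wages (illustrative)
t_j = np.array([1.0, 1.3, 1.5])      # iceberg costs into one destination j

n_goods = 200_000
U = rng.random((n_goods, T.size))
z = (-np.log(U) / T) ** (-1.0 / theta)   # inverse-CDF draws: F(z) = exp(-T z^-theta)
delivered_cost = W * t_j / z             # unit cost of serving market j
winner = delivered_cost.argmin(axis=1)   # lowest-cost supplier for each good
sim_share = np.bincount(winner, minlength=T.size) / n_goods

weights = T * (W * t_j) ** (-theta)
formula_share = weights / weights.sum()  # T_i (W_i t_ij)^-theta / Phi_j
```

With this many draws, `sim_share` and `formula_share` should agree to roughly two decimal places; the agreement does not depend on the price level, only on relative costs.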
Structural Gravity with Heterogeneous Firms
In the standard monopolistically competitive trade model, all firms are assumed to be identical, and all firms export. However, firm-level data show that firms differ in terms of a host of characteristics, including size and productivity, and that the latter is highly correlated with trade participation. By allowing for differences in firm-level productivity and a fixed cost of exporting, Melitz (2003) developed a model that can account for several of these features of the data. Chaney (2008) and Redding (2011) show that when firm-level productivity follows a Pareto distribution, the response of bilateral trade flows to changes in trade costs decomposes cleanly into the extensive and intensive margins of trade. This section shows how the Melitz model works when the distribution of productivity is Pareto. This yields a structural gravity equation that is similar to the gravity equations derived in the previous sections. As in the Eaton and Kortum model, the trade elasticity with respect to marginal trade costs is given by the parameters of the productivity distribution, here the Pareto distribution.
As in the previous section, consumers’ preferences are characterized by the Dixit-Stiglitz CES utility function defined over varieties. Aggregate demand in country $j$ for variety $\omega$ is given by
For a producer of the variety $\omega $ in country $i$, profits from selling in market $j\phantom{\rule{0.2em}{0ex}}$ are given by
where $A_i$ is the aggregate technology in country $i$, $\phi$ is the firm-specific productivity, and $f_{ij}^{X}$ is the fixed cost of exporting from market $i$ into market $j$. Profit maximization implies that the price of the varieties shipped from country $i$ to country $j$ will depend on the firm’s productivity draw ($\phi$); that is, the price of goods shipped from country $i$ to country $j$ is given by
where $\rho=\frac{\sigma-1}{\sigma}$. The profits earned by firms in country $i$ with productivity $\phi$ that sell in country $j$ are given by
Melitz defined the cutoff productivity ${\phi}_{ij}^{*}\phantom{\rule{0.2em}{0ex}}$ such that the firm’s profits are exactly zero: $\pi \left({\phi}_{ij}^{*}\right)=0$.
As in Chaney (2008), Redding (2011), and Melitz and Redding (2014), productivity is assumed to follow a Pareto distribution where the cumulative distribution function is given by $G\left(\phi\right)=1-\left(\frac{\overline{\phi}}{\phi}\right)^{\kappa}$ and is defined on the support $\left[\overline{\phi},\infty\right)$. Given the productivity distribution and the definition of the zero-profit cutoff productivity, the expected profits for firms in country $i$ selling in country $j$ are given by
and expected profits in market $j$ by all active firms in country $i$ are given by
Aggregating across all markets delivers an expression for expected profits
Free entry implies that the expected profits, conditional on a productivity draw greater than or equal to $\phi_{ii}^{*}$, are equal to the fixed costs of entry ($f_i^{E}$) in terms of domestic labor units; that is,
As in the Krugman model, the labor market clearing condition pins down the mass of firms in each country
Given the mass of firms, exports from country $i$ to country $j$ are given by
where $G\left(\phi_{ik}^{*}\right)=1-\left(\frac{\overline{\phi}}{\phi_{ik}^{*}}\right)^{\kappa}\ \forall\ k$. Substituting $\phi_{ij}^{*}W_if_{ij}^{X}=\left(\frac{W_it_{ij}}{\rho A_iP_j}\right)^{1-\sigma}E_j$ yields
The first term in brackets is the mass of firms in country $i$ that actively export to country $j$, which following Redding (2011) can be viewed as the extensive margin. A change in variable trade costs or fixed trade costs will impact the extensive margin through its impact on the cutoff productivity, ${\phi}_{ij}^{*}$. The second term in brackets, in turn, shows that the intensive margin is a function of the fixed cost of exporting, ${f}_{ij}^{X}$.
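The split between the two margins can be illustrated numerically. The sketch below follows the Chaney (2008) logic under simplifying assumptions: firm revenue is taken to be proportional to $\left(t/\phi\right)^{1-\sigma}$, the export cutoff is taken to be proportional to the variable trade cost $t$, and all constants (including the hypothetical `cutoff_scale`) are normalized away. It integrates revenue over the Pareto density above the cutoff and measures the elasticity of aggregate exports with respect to $t$, once letting the cutoff move and once holding it fixed.

```python
import numpy as np

sigma, kappa = 4.0, 5.0     # illustrative values; requires kappa > sigma - 1
phi_min = 1.0               # lower bound of the Pareto support
cutoff_scale = 2.0          # hypothetical: cutoff productivity = cutoff_scale * t

def trapezoid(y, x):
    """Plain trapezoid rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def aggregate_exports(t, cutoff):
    """Integrate revenue (t / phi)^(1 - sigma) over the Pareto density above cutoff."""
    phi = np.geomspace(cutoff, cutoff * 1e4, 200_001)
    density = kappa * phi_min**kappa * phi ** (-kappa - 1.0)
    revenue = (t / phi) ** (1.0 - sigma)
    return trapezoid(revenue * density, phi)

def log_elasticity(f, t0=1.5, h=1e-4):
    """Central difference of log f with respect to log t at t0."""
    up, down = t0 * (1.0 + h), t0 * (1.0 - h)
    return (np.log(f(up)) - np.log(f(down))) / (np.log(up) - np.log(down))

# Total response: the cutoff moves with t (firms enter and exit exporting).
total = log_elasticity(lambda t: aggregate_exports(t, cutoff_scale * t))
# Intensive margin only: the set of exporters is held fixed at the t0 cutoff.
intensive = log_elasticity(lambda t: aggregate_exports(t, cutoff_scale * 1.5))
extensive = total - intensive
```

Under these assumptions the total elasticity is close to $-\kappa$, the intensive margin contributes $1-\sigma$, and the extensive margin contributes the remainder, $-(\kappa-\sigma+1)$, so the elasticity of substitution drops out of the aggregate response.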
In order to express the bilateral trade equation in the form of structural gravity, the cutoff productivity of Equation 21 can be substituted into the bilateral trade equation to obtain
where $B_m=\overline{\phi}\,\frac{\left(\sigma-1\right)^{\kappa+1}\sigma^{-\frac{\kappa\sigma}{\sigma-1}}}{\kappa-\sigma+1}$. Market clearing implies
Defining ${\text{\Pi}}_{i}$ as
and
bilateral trade can be expressed as
Extensions
In the models reviewed in the previous section, all trade costs were modeled as iceberg trade costs. However, some trade costs create rents, and how these rents are (re)distributed can impact trade flows. For example, if there are ad valorem tariffs (i.e., tariffs on the value and not the quantity of the good) and those tariffs are distributed as lump-sum payments to households, a structural gravity equation emerges with tariffs included as part of the trade cost vector. Moreover, the inclusion of tariffs may allow the researcher to identify key parameters of the model. In both the Armington model and the model with monopolistic competition, the structural gravity equation is given by
where $\tau_{ij}$ is the gross tariff rate. When data on ad valorem tariff rates are available, they can be used to identify the elasticity of substitution across varieties.
Structural gravity also emerges when the production function is modified to include intermediate goods. Eaton and Kortum (2002) and Redding and Venables (2004) provide theoretical models that include intermediate goods used in the production process, as described in Fujita, Krugman, and Venables (1999). They show that when production technology is represented by a Cobb-Douglas production function using labor and a CES aggregate of intermediate goods, a structural gravity equation emerges. The main difference between these models and, for example, the Armington model is that factor prices will also be influenced by the inward multilateral resistance terms, $P_i$. Intuitively, better access to foreign intermediates tends to raise the returns to the domestic factors of production. In addition, a sectoral gravity equation emerges when there are many sectors, and the demand for the varieties produced in different sectors is weakly separable in the production function and/or in the utility function. Anderson and Yotov (2016) estimated a sectoral structural gravity equation and highlighted how trade costs vary across sectors. Redding and Weinstein (2019) used a nested CES demand system to show how a log-linear gravity equation can be estimated and aggregated, and how the overall effects of different trade costs can be decomposed into components reflecting the sectoral gravity equation estimates.
Hallak (2006), Hallak (2010), and Baldwin and Harrigan (2011) allowed for varieties to differ by quality. In these models, the demand for high-quality products increases with the consumer’s income. On the supply side, high-income countries also tend to produce high-quality goods. One explanation is that they produce high-quality goods in order to satisfy local market demand. Alternatively, high-income countries are more capable of producing high-quality goods because their firms are, on average, more productive and can therefore produce higher-quality goods more efficiently. As a result, countries with similar per capita GDPs are expected to trade more (see Linder, 1961). Hallak (2010) used a sectoral gravity-like equation to show that bilateral pairs that have similar per capita GDPs should trade more.
In all of the models discussed so far, preferences are characterized by a CES utility function. Novy (2013) derived a gravity equation where demand is characterized by a translog demand system. In contrast to the earlier models, the bilateral elasticity of trade with respect to trade costs is not constant in Novy (2013). Behrens and Murata (2012) and Arkolakis, Costinot, Donaldson, and Rodríguez-Clare (2015) showed that the structural gravity model can be obtained when agents have constant absolute risk aversion utility functions. In this case, there is a choke price that prevents all firms from exporting. Nevertheless, because firm technology follows an unbounded Pareto distribution in the ACDRC framework, there will always be positive bilateral trade, and a gravity-like equation can still be obtained.^{9}
Empirical Gravity
From Tinbergen’s early application until the beginning of this century, trade economists working on the gravity model focused mainly on either the theoretical foundations of the model or on expanding the list of covariates used to identify other natural, historical, cultural, and policy-related variables that affect bilateral trade. During this time, nearly every empirical application estimated the gravity model in log-linear form using ordinary least squares (OLS).^{10} The influential work by Santos Silva and Tenreyro (2006) called into question the use of the log-linear specification. Santos Silva and Tenreyro (hereafter SST) argued that the error term was likely heteroskedastic, and the variance was likely a function of the right-hand side control variables. If this were the case, the coefficient estimates from the log-linear model would be inconsistent. SST proposed using a Poisson Pseudo Maximum Likelihood (PPML) estimator instead. This section discusses the properties of the different estimators, starting with the assumptions required for OLS to yield consistent estimates. It subsequently covers the estimation of the gravity model using the PPML estimator, the Gamma Pseudo Maximum Likelihood (GPML) estimator, and Nonlinear Least Squares (NLS).^{11} This section closes by addressing the potential endogeneity of the policy variables and discussing how this has been addressed.
The Log-Linear Model
Assuming that the expected value of trade is given by the structural gravity equation as derived in the previous section, observed bilateral trade is given by
In addition to the error term ${u}_{ijt}$, time subscripts have been added to the gravity model to allow for the estimation to include several years of bilateral trade data. In some instances, it will be more convenient to substitute for the trade cost vector in the functional form and express Equation 22 as
Most early applications of the gravity model estimated Equation (22) without accounting for the multilateral resistance terms. To illustrate, the first column of Table 1 presents the coefficients of the log-linear model estimated using pooled OLS for five-year intervals from 1974 to 2014.^{12} Included in these regressions are time dummies that capture the yearly variations in worldwide trade. The results are consistent with many papers estimating the log-linear gravity model. Specifically, the coefficients on GDP are (relatively) close to unity, the absolute value of the distance elasticity is close to unity, and the coefficients on the common language, contiguity, colony, and FTA indicator variables all indicate a positive effect on trade.
Addressing Multilateral Resistance
There are a number of reasons why the coefficient estimates from this specification may be inconsistent. The most obvious, given the theoretical discussion in the previous section, is that this specification does not include controls for the multilateral resistance terms.^{13} Anderson and van Wincoop used an iterative nonlinear least squares estimator that computes and incorporates the multilateral resistance terms. However, a more straightforward way to account for the multilateral resistance terms that avoids custom programming is to include importer and exporter fixed effects. The inclusion of these fixed effects means that the trade elasticity with respect to GDPs can no longer be identified directly. When extended to a panel setting, Baldwin and Taglioni (2006) and Baier and Bergstrand (2007) emphasized that the theoretically consistent fixed effects should be specified as exporter-year and importer-year fixed effects.
The structural gravity equation can now be rewritten as
where $X_{ij}^{SG}=\exp\left(Z_{ijt}\beta+\delta^{X}D_{it}+\delta^{M}D_{jt}\right)$, $Z_{ijt}$ is a $k$-dimensional vector capturing bilateral trade costs, and $D_{it}$ $\left(D_{jt}\right)$ are exporter-year (importer-year) dummy indicators. One issue that arises in estimating Equation (23) is that as the number of countries and years in the panel rises, the estimation of the coefficients on the dummy indicators becomes increasingly difficult and time consuming. This challenge led to Baier and Bergstrand’s (2009) linearized version of the multilateral resistance terms, which greatly reduced the number of parameters and allowed for the inclusion of importer- and exporter-specific effects. However, these technical issues are less of a concern now that most statistical packages have custom programs that allow the researcher to estimate models using high-dimensional fixed effects.
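As a minimal sketch of the fixed-effects approach in the log-linear case, the example below simulates a small panel in which log trade depends on log distance, an FTA indicator, and exporter-year and importer-year effects, and then recovers the bilateral coefficients by least squares on a design matrix with both sets of dummies. The data, sample sizes, and true coefficients are entirely synthetic, and building explicit dummy columns is only practical for small panels; dedicated high-dimensional fixed-effects routines would be used in applied work.

```python
import numpy as np

rng = np.random.default_rng(1)
n, years = 12, 3

# All ordered (exporter i, importer j, year y) triples, excluding i == j.
i, j, y = [a.ravel() for a in np.meshgrid(np.arange(n), np.arange(n),
                                          np.arange(years), indexing="ij")]
keep = i != j
i, j, y = i[keep], j[keep], y[keep]

dist = np.exp(rng.normal(8.0, 0.8, size=(n, n)))
dist = (dist + dist.T) / 2.0                       # symmetric bilateral distance
ln_dist = np.log(dist[i, j])
fta = (rng.random(i.size) < 0.3).astype(float)     # synthetic agreement indicator
exp_fe = rng.normal(size=(n, years))               # exporter-year effects
imp_fe = rng.normal(size=(n, years))               # importer-year effects

beta_true = np.array([-1.0, 0.4])                  # (distance, FTA), illustrative
ln_trade = (beta_true[0] * ln_dist + beta_true[1] * fta
            + exp_fe[i, y] + imp_fe[j, y] + rng.normal(0.0, 0.3, i.size))

def dummies(codes):
    """One indicator column per distinct integer code."""
    return np.eye(codes.max() + 1)[codes]

Z = np.column_stack([ln_dist, fta])
design = np.column_stack([Z, dummies(i * years + y), dummies(j * years + y)])
# The dummy blocks are collinear with each other, so lstsq returns a minimum-norm
# solution; the coefficients on Z are still uniquely identified.
coef, *_ = np.linalg.lstsq(design, ln_trade, rcond=None)
beta_hat = coef[:2]
```

Here `beta_hat` should be close to `beta_true`, while country-level variables such as GDP could not be added to `Z` because they are absorbed by the exporter-year and importer-year dummies.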
Table 1. Comparison of Different Estimators

| VARIABLES | (1) OLS | (2) OLS FE | (3) PPML FE | (4) GPML FE | (5) NLLS FE |
|---|---|---|---|---|---|
| Importer GDP | 1.020*** | | | | |
| | (0.004) | | | | |
| Exporter GDP | 1.230*** | | | | |
| | (0.004) | | | | |
| (log) DIST | −1.040*** | −1.391*** | −0.582*** | −1.236*** | −0.505*** |
| | (0.011) | (0.012) | (0.016) | (0.011) | (0.044) |
| CONTIG | 0.328*** | 0.476*** | 0.393*** | 0.773*** | 0.455*** |
| | (0.053) | (0.052) | (0.039) | (0.046) | (0.086) |
| LANG | 0.762*** | 0.827*** | 0.077** | 0.521*** | 0.004 |
| | (0.022) | (0.022) | (0.039) | (0.021) | (0.145) |
| Colony | 1.303*** | 0.976*** | 0.225*** | 1.073*** | 0.006 |
| | (0.053) | (0.038) | (0.040) | (0.035) | (0.084) |
| FTA | 1.515*** | 0.325*** | 0.585*** | 0.465*** | 0.556*** |
| | (0.030) | (0.027) | (0.036) | (0.025) | (0.128) |
| Constant | −1.681*** | 19.879*** | 24.859*** | 18.714*** | |
| | (0.110) | (0.457) | (0.426) | (0.461) | |
| Observations | 110,516 | 110,741 | 111,237 | 111,237 | 111,237 |
| R-squared | 0.624 | 0.744 | | | |

Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Heteroskedasticity and the Structural Gravity Model
For many years, nearly all empirical papers estimated a log-linear gravity model using ordinary least squares, which in many instances may have led to biased coefficient estimates. If the error term is heteroskedastic and the variance of the error term is correlated with the right-hand side variables, the estimates are likely to be biased. Estimating Equation (23) in log levels will yield consistent estimates under the following conditions
Furthermore, in the presence of missing trade data, the OLS coefficient estimates will only be consistent when the data are completely missing at random, or the missing observations are functions of the righthand side controls but independent of the error terms. There are statistical tests that can be performed to check whether the zero trade flows are economically determined as opposed to missing at random. Perhaps the simplest of these tests is to estimate Equation 22 while including an indicator variable that shows if the bilateral pair has positive trade flows in the subsequent period. If the coefficient on this variable is statistically significant, the zero trade flows are likely economically determined.
Alternatively, Helpman, Melitz, and Rubinstein (2008, hereafter HMR) employ a Heckman-like correction to account for firm heterogeneity and zero trade flows. They develop a model where firm productivity is drawn from a truncated Pareto distribution. They then show how to account for firm heterogeneity empirically and how to employ a two-step Heckman correction for selection into trade. As is typical with Heckman corrections, the researcher needs to find a variable that influences the extensive margin of trade without impacting the intensive margin of trade. HMR used data from the World Bank’s “Doing Business” report for a core set of countries and used religion as the identifying variable for a broader group of countries. However, Santos Silva and Tenreyro (2015) showed that the HMR specification is only valid under relatively strong distributional assumptions and that these assumptions were rejected by standard statistical tests.
Gamma and Poisson Estimators
While controlling for the multilateral resistance terms helps to account for the correlation between the trade costs and the error term, there are other reasons to suspect the coefficient estimates may be inconsistent. SST showed that the log-linear specification leads to inconsistent estimates if the error term is heteroskedastic, and the variance depends on the right-hand side control variables. To see how the heteroskedasticity is likely to depend on the right-hand side variables, the gravity equation is rewritten as
where $E\left(\nu_{ijt}\mid Z_{ij},D_i,D_j\right)=\exp\left(h\left(Z_{ij}\right)\cdot e_{ijt}\right)$ and $e_{ijt}\sim N\left(0,\sigma^{2}\right)$ so that $\nu_{ijt}$ is lognormal with a zero mean and variance that is a function of the $Z_{ij}$’s. Then the expected value $E\left[\ln\left(\nu_{ijt}\right)\mid Z_{ij}\right]=\frac{1}{2}\sigma_{\nu}^{2}$ so that the coefficient estimates of the log-linear model would be given by
When heteroskedasticity is present, and the conditional mean function is exponential, SST showed that the PPML estimator provides consistent estimates.^{14} However, there are other Pseudo Maximum Likelihood Estimators that will also lead to consistent estimates of the parameters of interest. The first-order conditions for this class of models include
In the case of the PPML estimator, the variance is proportional to the mean, and so the first-order conditions include
For the Gamma Pseudo Maximum Likelihood estimator (GPML), where the variance is proportional to the square of the mean, the first-order conditions would include
The term in brackets is the percentage difference in the actual trade from the predicted trade. As Head and Mayer (2014) pointed out, this term may be roughly equal to the log difference in actual trade and predicted trade; in which case, the coefficient estimates may be similar to those using OLS.
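The inconsistency of log-linear OLS under heteroskedasticity, and the robustness of PPML to it, can be reproduced on synthetic data. In the sketch below, trade equals $\exp\left(\beta_0+\beta_1x\right)\nu$, where $\nu$ is lognormal with conditional mean one and a log-variance that rises with $x$; this data-generating process is an assumption chosen for illustration. PPML is implemented by iteratively reweighted least squares, which is one standard way to solve the Poisson first-order conditions, and the function name `ppml` is hypothetical.

```python
import numpy as np

def ppml(y, Z, n_iter=100, tol=1e-10):
    """Poisson pseudo-ML for E[y | Z] = exp(Z @ beta), solved by IRLS.

    Assumes the first column of Z is a constant.
    """
    beta = np.zeros(Z.shape[1])
    beta[0] = np.log(y.mean())                     # start at the sample mean
    for _ in range(n_iter):
        mu = np.exp(Z @ beta)
        working = Z @ beta + (y - mu) / mu         # working dependent variable
        WZ = Z * mu[:, None]                       # Poisson weights equal mu
        beta_new = np.linalg.solve(Z.T @ WZ, WZ.T @ working)
        done = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if done:
            break
    return beta

rng = np.random.default_rng(2)
n = 50_000
x = rng.uniform(-1.0, 1.0, n)
Z = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 1.0])

s2 = 0.5 + 0.4 * x                                 # log-variance of the error rises with x
ln_nu = rng.normal(-s2 / 2.0, np.sqrt(s2))         # ensures E[nu | x] = 1
y = np.exp(Z @ beta_true + ln_nu)

beta_ols, *_ = np.linalg.lstsq(Z, np.log(y), rcond=None)   # log-linear OLS
beta_ppml = ppml(y, Z)
```

In this design $E\left[\ln\nu\mid x\right]=-0.25-0.2x$, so the OLS slope converges to about $0.8$ rather than the true value of $1$, while the PPML slope stays close to $1$.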
Nonlinear Least Squares
The final specification discussed in this section is nonlinear least squares (NLS). For NLS, the variance is independent of the conditional mean so that the first-order conditions include
As long as the conditional mean is correctly specified, and the sample size is sufficiently large, the coefficient estimates should be similar across these specifications.
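The three estimators can be compared side by side by writing their first-order conditions in a common form. The expressions below are a sketch in this section's notation, with $\widehat{X}_{ijt}=\exp\left(Z_{ijt}\widehat{\beta}+\widehat{\delta}^{X}D_{it}+\widehat{\delta}^{M}D_{jt}\right)$ and $V(\cdot)$ denoting the assumed variance function; they follow the general pseudo-maximum-likelihood structure rather than reproducing the article's displayed equations.

$$
\sum_{i,j,t}\frac{X_{ijt}-\widehat{X}_{ijt}}{V\left(\widehat{X}_{ijt}\right)}\,\widehat{X}_{ijt}\,Z_{ijt}=0
$$

$$
\begin{aligned}
\text{PPML } \left(V\propto\widehat{X}_{ijt}\right):&\quad \sum_{i,j,t}\left(X_{ijt}-\widehat{X}_{ijt}\right)Z_{ijt}=0\\
\text{GPML } \left(V\propto\widehat{X}_{ijt}^{2}\right):&\quad \sum_{i,j,t}\left(\frac{X_{ijt}-\widehat{X}_{ijt}}{\widehat{X}_{ijt}}\right)Z_{ijt}=0\\
\text{NLS } \left(V\ \text{constant}\right):&\quad \sum_{i,j,t}\left(X_{ijt}-\widehat{X}_{ijt}\right)\widehat{X}_{ijt}\,Z_{ijt}=0
\end{aligned}
$$

Read this way, the estimators differ only in how they weight deviations: GPML weights percentage deviations equally, PPML weights level deviations equally, and NLS puts additional weight on the largest predicted flows.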
Table 1 columns 3 to 5 include the results for the PPML, the GPML, and NLS. As expected, the GPML estimates are similar to those of the OLS-FE model. The absolute value of the distance elasticity is lower for the Poisson model than it is for the OLS-FE and GPML; this is quite common and was pointed out by SST. For the NLS model, the coefficient estimates on language and colony are notably different from the other specifications.
Model Selection and Heteroskedasticity
In order to assess these models, a number of standard tests for functional form and for the presence of heteroskedasticity can be implemented. To test for the latter, SST used the Ramsey RESET test, but this test may also be thought of as a test for functional form. The idea of the Ramsey RESET test is straightforward. After estimating the model, save the predicted values and rerun the model with the same controls along with the squares of the predicted values and other higher-order terms. If these additional regressors are not statistically significant, the functional form is likely correctly specified and heteroskedasticity is not a problem. Another commonly used test is the MaMu (or Park) test for heteroskedasticity. For this test, you again save the fitted value from the original specification, create $\widehat{V}_{ijt}=\left(X_{ijt}-\widehat{X}_{ijt}\right)^{2}$, and subsequently estimate the following model
Using the same estimator that generated the predicted values, a statistical test on the value of $\lambda_1$ can help discriminate among the models. If the coefficient estimate is close to one (two), the PPML (GPML) estimator is more efficient, and if the coefficient is close to zero, then NLS may be appropriate. In many cases, the coefficient estimate is somewhere between one and two. Head and Mayer (2014) ran a simulation exercise in which the variance structure is proportional to the mean, making PPML the most efficient estimator. They found the coefficient estimate on $\lambda_1$ to be close to 1.60. When $\widehat{\lambda}_1$ was significantly below two, the MaMu (Park) test was a near-perfect predictor for the model specification.^{15}
Given the advances in computing power and improvement in estimating techniques, best practices for reporting empirical results would include estimating the model using OLS, PPML, GPML, and potentially NLS.^{16} As Head and Mayer suggest, if all the coefficient estimates are similar, then there is little reason for concern. If the coefficient estimates are economically different, then the Ramsey RESET test and the MaMu (Park) test can provide additional insights into the correct empirical specification.
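A stripped-down version of this diagnostic is sketched below, using the log-log (Park-type) regression $\ln\widehat{V}_{ijt}=\lambda_0+\lambda_1\ln\widehat{X}_{ijt}+\text{error}$ estimated by OLS. Two simplifications are assumed: the true conditional mean stands in for the fitted values, and the data are Poisson counts, so the variance is proportional to the mean by construction and $\lambda_1$ should be near one. All numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

# Stand-in for fitted trade flows (assumed known here for simplicity).
x_hat = np.exp(rng.uniform(np.log(20.0), np.log(2000.0), n))
# Poisson outcomes: variance equals the mean.
x_obs = rng.poisson(x_hat).astype(float)

v_hat = (x_obs - x_hat) ** 2                       # squared residuals
A = np.column_stack([np.ones(n), np.log(x_hat)])
lam0, lam1 = np.linalg.lstsq(A, np.log(v_hat), rcond=None)[0]
```

Replacing the Poisson draw with an error whose standard deviation is proportional to the mean would push `lam1` toward two, the case in which GPML is the more efficient choice.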
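The mechanics of the RESET procedure are simple enough to sketch for the log-linear case. The hypothetical helper below fits OLS, adds the squared fitted values as an extra regressor, and returns the conventional (homoskedastic) t-statistic on that term; it is applied to one correctly specified and one deliberately misspecified synthetic model. For PPML or GPML the same steps would be carried out with the corresponding estimator and robust standard errors, which this sketch does not attempt.

```python
import numpy as np

def reset_t_stat(y, Z):
    """t-statistic on squared fitted values added to an OLS regression of y on Z."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ beta
    Z_aug = np.column_stack([Z, fitted ** 2])      # original controls plus fitted^2
    beta_aug, *_ = np.linalg.lstsq(Z_aug, y, rcond=None)
    resid = y - Z_aug @ beta_aug
    dof = len(y) - Z_aug.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(Z_aug.T @ Z_aug)
    return float(beta_aug[-1] / np.sqrt(cov[-1, -1]))

rng = np.random.default_rng(4)
x = rng.normal(size=5_000)
Z = np.column_stack([np.ones_like(x), x])

# Correct functional form: the added term should be insignificant.
t_linear = reset_t_stat(1.0 + 0.5 * x + rng.normal(size=x.size), Z)
# Omitted curvature: the added term should be strongly significant.
t_curved = reset_t_stat(1.0 + 0.5 * x + 0.3 * x**2 + rng.normal(size=x.size), Z)
```

A large absolute value of the statistic, as in the second case, signals that the fitted conditional mean is missing something systematic.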
Endogenous Trade Policy
In many cases, the researcher may be concerned that the right-hand side controls are not exogenous. This is most likely to arise when policy variables are included in the specification. Clearly, tariffs and trade agreements are the results of negotiations between bilateral pairs and are hence unlikely to be randomly distributed across bilateral pairs even after controlling for other right-hand side variables. By running a series of cross-sectional gravity equations over time, Baier and Bergstrand (2007) showed that the estimated coefficients on free trade agreements are less stable than those on standard gravity controls.^{17} Table 2 presents the results for the gravity equation for five-year intervals from 1979 to 2014. In order to account for the endogeneity, one must find instruments that are correlated with the tariffs or trade agreements but uncorrelated with the error term in the trade flow equation. An alternative is to assume that there are bilateral specific effects that evolve slowly over time to the point where the researcher can assume that they are constant. One can then estimate the model using bilateral fixed effects. Baier and Bergstrand found that when controlling for time-varying multilateral resistance using standard panel fixed effects, the coefficient on trade agreements was positive and significant and was robust to changes in the specification.
Table 2 presents the results of the OLS specification at five-year intervals from 1979 to 2014. The coefficient on trade agreements is negative and significant for several years, after which it becomes positive and significant. The coefficient on trade agreements ranges from −0.689 to 0.590. If the policy variables are correlated with the error term, consistent estimates can be obtained by using standard instrumental variable techniques. Egger et al. (2011) and Magee (2003) are two notable examples that use IV estimation. Rather than taking the IV approach, Baier and Bergstrand (2007) assumed that the policy variables are correlated with an unobserved component that is fixed or sufficiently slow moving over time. If this assumption holds and all of the other conditions needed for consistent estimation of the log-linear gravity model are met, then consistent estimates can be obtained by fixed effects or by first differencing the data.
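The logic of the fixed-effects remedy can be illustrated with a synthetic panel. In the sketch below, each country pair has a time-invariant unobserved effect that raises both its trade and its probability of signing an agreement; this selection rule, the panel dimensions, and the true effect of 0.3 are assumptions made for illustration. Pooled OLS attributes part of the pair effect to the agreement, while the within (bilateral fixed effects) estimator removes it by demeaning each pair's data. Exporter-year and importer-year effects are omitted to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(5)
n_pairs, n_years = 2_000, 8
beta_true = 0.3                                    # illustrative FTA effect

pair_effect = rng.normal(size=n_pairs)             # unobserved bilateral affinity
# Assumed selection: high-affinity pairs are more likely to sign an agreement.
signs = rng.random(n_pairs) < 1.0 / (1.0 + np.exp(-2.0 * pair_effect))
start = rng.integers(1, n_years, size=n_pairs)     # entry year for signing pairs
year = np.arange(n_years)
fta = (signs[:, None] & (year[None, :] >= start[:, None])).astype(float)

ln_trade = (beta_true * fta + pair_effect[:, None]
            + rng.normal(0.0, 0.5, size=(n_pairs, n_years)))

def slope(x, y):
    """OLS slope of y on x with an intercept."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x * y).sum() / (x * x).sum())

def demean_by_pair(a):
    """Subtract each pair's time average (the within transformation)."""
    return a - a.mean(axis=1, keepdims=True)

pooled = slope(fta.ravel(), ln_trade.ravel())
within = slope(demean_by_pair(fta).ravel(), demean_by_pair(ln_trade).ravel())
```

In this design `pooled` is far above the true effect, whereas `within` is close to it, because only pairs whose agreement status changes over time contribute to the within estimate.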
Table 2. Stability of the Coefficients

| VARIABLES | (1) OLS FE 1979 | (2) OLS FE 1984 | (3) OLS FE 1989 | (4) OLS FE 1994 | (5) OLS FE 1999 | (6) OLS FE 2004 | (7) OLS FE 2009 | (8) OLS FE 2014 |
|---|---|---|---|---|---|---|---|---|
| (log) DIST | −1.369*** | −1.463*** | −1.427*** | −1.402*** | −1.357*** | −1.448*** | −1.406*** | −1.402*** |
| | (0.040) | (0.041) | (0.037) | (0.031) | (0.030) | (0.033) | (0.034) | (0.033) |
| CONTIG | 0.199 | 0.016 | 0.180 | 0.691*** | 0.579*** | 0.634*** | 0.369** | 0.731*** |
| | (0.174) | (0.179) | (0.159) | (0.138) | (0.123) | (0.139) | (0.149) | (0.141) |
| LANG | 0.607*** | 0.574*** | 0.772*** | 0.910*** | 0.810*** | 0.912*** | 1.035*** | 0.884*** |
| | (0.078) | (0.079) | (0.072) | (0.063) | (0.057) | (0.059) | (0.061) | (0.059) |
| Colony | 1.317*** | 1.219*** | 0.961*** | 0.847*** | 0.943*** | 0.766*** | 0.749*** | 0.592*** |
| | (0.119) | (0.123) | (0.117) | (0.108) | (0.103) | (0.113) | (0.109) | (0.113) |
| FTA | −0.579*** | −0.689*** | −0.351*** | −0.499*** | 0.361*** | 0.430*** | 0.590*** | 0.453*** |
| | (0.146) | (0.141) | (0.115) | (0.081) | (0.065) | (0.067) | (0.064) | (0.058) |
| Observations | 8,389 | 8,168 | 8,766 | 11,929 | 14,390 | 16,542 | 17,140 | 17,353 |
| R-squared | 0.699 | 0.699 | 0.738 | 0.757 | 0.758 | 0.753 | 0.749 | 0.762 |

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Baier and Bergstrand (2007) also included lags and leads to capture the dynamic aspects of trade agreements. The lagged values of the trade agreement variables detect the phase-in effects, while the leads detect feedback effects (i.e., where large bilateral trade flows lead to new trade agreements). Anderson and Yotov (2016) obtained qualitatively similar findings using a PPML estimator with bilateral fixed effects. Table 3 presents the results using a standard fixed effects estimation and the fixed-effect PPML estimator. In both specifications, there is evidence of economically and statistically significant lagged effects of trade agreements and little evidence of feedback effects. The fixed-effect PPML estimates are also smaller than those from the standard log-linear fixed effects specification.
Table 3. Lagging and Leading Trade Agreements
VARIABLES | (1) OLS FE | (2) OLS FE | (3) OLS FE | (4) PPML FE | (5) PPML FE | (6) PPML FE
FTA | 0.309*** | 0.148*** | 0.217*** | 0.097*** | 0.044* | 0.069***
 | (0.027) | (0.032) | (0.037) | (0.025) | (0.025) | (0.025)
FTA (−5) | | 0.134*** | 0.224*** | | 0.082*** | 0.145***
 | | (0.033) | (0.036) | | (0.021) | (0.027)
FTA (−10) | | 0.100*** | 0.177*** | | −0.018 | 0.013
 | | (0.033) | (0.038) | | (0.026) | (0.029)
FTA (+5) | | | −0.052 | | | −0.001
 | | | (0.036) | | | (0.030)
Exporter-Year FE | Yes | Yes | Yes | Yes | Yes | Yes
Importer-Year FE | Yes | Yes | Yes | Yes | Yes | Yes
Observations | 110,668 | 93,904 | 76,000 | 112,415 | 95,574 | 77,960
R-squared | 0.875 | 0.889 | 0.896 | 0.993 | 0.994 | 0.995
Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
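One way to read the lagged coefficients in Table 3 is to add them to the contemporaneous coefficient and convert the sum into a percentage change in trade. The short sketch below does this arithmetic for the point estimates in columns (2) and (5). It assumes that the four lagged estimates in each row belong to columns (2), (3), (5), and (6), and the function and variable names are hypothetical. It ignores sampling uncertainty, so it is an illustration of the calculation rather than a formal estimate.

```python
import math

# Point estimates copied from Table 3 (column assignment is an assumption):
# contemporaneous FTA effect, five-year lag, ten-year lag.
OLS_FE_COL2 = [0.148, 0.134, 0.100]
PPML_FE_COL5 = [0.044, 0.082, -0.018]


def cumulative_effect(coefs):
    """Sum the FTA coefficients and convert the total to a percent change in trade."""
    total = sum(coefs)
    # In a log-linear or PPML specification, a dummy coefficient b implies
    # a proportional change in trade of exp(b) - 1.
    percent = (math.exp(total) - 1.0) * 100.0
    return total, percent


for label, coefs in [("OLS FE (2)", OLS_FE_COL2), ("PPML FE (5)", PPML_FE_COL5)]:
    total, percent = cumulative_effect(coefs)
    print(f"{label}: sum of coefficients = {total:.3f}, "
          f"implied change in trade = {percent:.1f}%")
```

Under these assumptions, the implied cumulative effect after ten years is roughly 47 percent in the log-linear specification and roughly 11 percent in the PPML specification, which illustrates how much smaller the PPML estimates are.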
The Current State and Future of Gravity
Over time, improvements to the data and theoretical innovations have resolved several of the empirical puzzles that trade economists identified when employing the gravity equation. McCallum’s border puzzle is one such issue and was addressed by Anderson and van Wincoop (2003). Another puzzle that has been widely discussed is the distance puzzle. Several studies have shown that the absolute value of the elasticity of trade with respect to distance has increased over time (see, e.g., Disdier & Head, 2008). Using data that includes gross production and intracountry trade, Yotov (2012) showed that the effect of distance on trade has declined over time when one measures the impact of distance on international relative to intranational trade. Caron, Fally, and Markusen (2014) argued that incorporating a gravity framework into a model with multiple sectors and nonhomothetic preferences addresses several puzzles in international trade.
More recently, the gravity framework has been used to quantify the general equilibrium impacts of trade policies and to assess their welfare implications. Most empirical specifications assume that incomes and prices are exogenous, which may be appropriate when the unit of observation is bilateral trade. Given the theoretical developments of the gravity model, it is relatively easy to embed measured trade costs into general equilibrium models and observe how changes in trade costs affect prices and incomes. An important contribution that set the stage for the use of the gravity equation in evaluating trade policies was the paper of Arkolakis, Costinot, and Rodríguez-Clare (2012). They showed that for a wide class of models, the welfare implications depend on the share of expenditure on domestically produced goods and the elasticity of trade with respect to (variable) trade costs. Another significant contribution that led to the gravity model’s use in evaluating trade policy was the small-scale model developed by Alvarez and Lucas (2007). Alvarez and Lucas showed how the Eaton and Kortum model could be calibrated to simulate changes in trade policy. Caliendo and Parro (2015) quantified the impact of the tariff reductions that resulted from the North American Free Trade Agreement (NAFTA). Caliendo et al. (2017) used a quantitative gravity model to evaluate the impact of 20 years of tariff changes through the GATT/WTO and trade agreements. Felbermayr, Gröschl, and Steininger (2018) used a quantitative trade model to evaluate the impact of Brexit.
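The welfare result of Arkolakis, Costinot, and Rodríguez-Clare (2012) is usually written as $\hat{W} = \hat{\lambda}^{1/\varepsilon}$, where $\hat{W}$ is the proportional change in real income, $\hat{\lambda}$ is the proportional change in the domestic expenditure share, and $\varepsilon < 0$ is the trade elasticity. The sketch below applies that formula to a move to autarky, where the domestic share becomes 1. The function names and the numerical inputs (a domestic share of 0.93 and elasticities of −5 and −10) are illustrative assumptions, not values taken from this article.

```python
def welfare_change(share_before, share_after, trade_elasticity):
    """Proportional change in real income, W'/W = (lambda'/lambda)^(1/epsilon).

    trade_elasticity is assumed to be negative (the elasticity of trade
    with respect to variable trade costs).
    """
    if trade_elasticity >= 0:
        raise ValueError("trade_elasticity is expected to be negative")
    if not (0 < share_before <= 1 and 0 < share_after <= 1):
        raise ValueError("expenditure shares must lie in (0, 1]")
    return (share_after / share_before) ** (1.0 / trade_elasticity)


def gains_from_trade(domestic_share, trade_elasticity):
    """Percent of real income lost when moving from the observed share to autarky."""
    return (1.0 - welfare_change(domestic_share, 1.0, trade_elasticity)) * 100.0


# Illustrative inputs only: a domestic expenditure share of 0.93
# and two hypothetical values of the trade elasticity.
for eps in (-5.0, -10.0):
    print(f"elasticity {eps}: gains from trade = {gains_from_trade(0.93, eps):.2f}%")
```

With these inputs, the implied gains from trade are about 1.4 percent and 0.7 percent of real income, respectively. Only two numbers are needed, which is why the formula has become a convenient starting point for quantitative gravity exercises.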
Since the advent of new trade theory, there has been an interest in linking trade, firm location, and economic geography. Early theoretical and empirical examples are Fujita et al. (1999) and Redding and Venables (2004). These papers focused on market access and supplier access. As more and better data have become available, these models have used the gravity framework to address how market access and supplier access have affected different areas (see, e.g., Donaldson & Hornbeck, 2016; Donaldson, 2018; Allen & Arkolakis, 2014; Ahlfeldt, Redding, Sturm, & Wolf, 2015).
In the future, several areas need to be addressed. As pointed out by Lai and Trefler (2002), the gravity equation does an excellent job of explaining cross-sectional variation in trade flows but does not perform as well in explaining the growth of trade. The reason is straightforward: for much of the post–World War II period, trade has increased faster than income. For the gravity model to explain the growth in trade, there must be changes in trade costs that have led to an increase in trade. In most specifications, however, the bilateral control variables on the right-hand side are constant over time and thus cannot explain the growth in trade. A related area of research is the development of dynamic models of international trade, both at the aggregate level and incorporating firm dynamics. Anderson, Larch, and Yotov (2015) used an Armington framework with capital accumulation to develop a dynamic general equilibrium gravity model. Sampson (2016) and Perla, Tonetti, and Waugh (2015) provided dynamic models of trade and growth with heterogeneous firms. Finally, an area for future research is a better understanding of trade costs derived from first principles. Most theoretical developments have concerned firm production and individual preferences. In almost all cases, trade costs are simply assumed to be iceberg trade costs, and the functional form of the trade costs is log-linear. Chaney (2018) provided a model that helps to explain the role of distance in the gravity equation.
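The point that time-invariant bilateral controls cannot account for trade growth can be seen in a stylized log-linear specification. The notation here is assumed for illustration and is not taken from the earlier derivations: $\alpha_{i,t}$ and $\gamma_{j,t}$ are exporter-year and importer-year effects, $z_{ij}$ collects time-invariant bilateral variables such as distance, contiguity, and common language, and $FTA_{ij,t}$ stands in for any time-varying trade-cost measure.

$$
\ln X_{ij,t} = \alpha_{i,t} + \gamma_{j,t} + \beta' z_{ij} + \delta\, FTA_{ij,t} + \varepsilon_{ij,t}
$$

$$
\ln X_{ij,t} - \ln X_{ij,t-1} = (\alpha_{i,t} - \alpha_{i,t-1}) + (\gamma_{j,t} - \gamma_{j,t-1}) + \delta\,(FTA_{ij,t} - FTA_{ij,t-1}) + (\varepsilon_{ij,t} - \varepsilon_{ij,t-1})
$$

With constant coefficients, the term $\beta' z_{ij}$ cancels in the difference, so growth in bilateral trade can be attributed only to country-level changes and to trade-cost components that vary over time.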
Further Reading
 Alvarez, F., & Lucas, R. E., Jr. (2007). General equilibrium analysis of the Eaton–Kortum model of international trade. Journal of Monetary Economics, 54(6), 1726–1768.
 Anderson, J. E., & van Wincoop, E. (2003). Gravity with gravitas: A solution to the border puzzle. American Economic Review, 93(1), 170–192.
 Baier, S. L., & Bergstrand, J. H. (2007). Do free trade agreements actually increase members’ international trade? Journal of International Economics, 71(1), 72–95.
 Baier, S. L., & Bergstrand, J. H. (2009). Bonus vetus OLS: A simple method for approximating international trade-cost effects using the gravity equation. Journal of International Economics, 77(1), 77–85.
 Baier, S. L., Kerr, A., & Yotov, Y. V. (2018). Gravity, distance, and international trade. In B. Blonigen & W. Wilson (Eds.), Handbook of international trade and transportation (pp. 15–78). Northampton, MA: Edward Elgar.
 Bergstrand, J. H., & Egger, P. (2011). Gravity equations and economic frictions in the world economy. In D. Bernhofen, R. Falvey, D. Greenaway, & U. Kreickemeier (Eds.), Palgrave handbook of international trade. London: Palgrave Macmillan.
 Caliendo, L., Feenstra, R. C., Romalis, J., & Taylor, A. M. (2017). Tariff reductions, entry, and welfare: Theory and evidence for the last two decades. National Bureau of Economic Research, Working paper series No. 21768.
 Eaton, J., & Kortum, S. (2002). Technology, geography and trade. Econometrica, 70(5), 1741–1779.
 Felbermayr, G., Gröschl, J., & Steininger, M. (2018). Brexit through the lens of new quantitative trade theory. In Annual Conference on Global Economic Analysis at Purdue University.
 Hallak, J. C. (2010). A product-quality view of the Linder hypothesis. Review of Economics and Statistics, 92(3), 453–466.
 Head, K., & Mayer, T. (2014). Gravity equations: Workhorse, toolkit, and cookbook. In G. Gopinath, E. Helpman, & K. Rogoff (Eds.), Handbook of International Economics (Vol. 4, pp. 131–195). North Holland: Elsevier.
 Helpman, E., Melitz, M., & Rubinstein, Y. (2008). Trading partners and trading volumes. Quarterly Journal of Economics, 123(2), 441–487.
 Krugman, P. R. (1979). Increasing returns, monopolistic competition, and international trade. Journal of International Economics, 9(4), 469–479.
 Krugman, P. R. (1980). Scale economies, product differentiation, and the pattern of trade. American Economic Review, 70(5), 950–959.
 Melitz, M. J. (2003). The impact of trade on intra-industry reallocations and aggregate industry productivity. Econometrica, 71(6), 1695–1725.
 Melitz, M., & Redding, S. (2014). Heterogeneous firms and trade. In G. Gopinath, E. Helpman, & K. Rogoff (Eds.), Handbook of International Economics (Vol. 4, pp. 131–195). North Holland: Elsevier.
 Piermartini, R., & Yotov, Y. (2016). Estimating trade policy effects with structural gravity. School of Economics Working Paper Series. LeBow College of Business, Drexel University.
 Redding, S. J. (2011). Theories of heterogeneous firms and trade. Annual Review of Economics, 3(1), 77–105.
 Redding, S., & Venables, A. J. (2004). Economic geography and international inequality. Journal of International Economics, 62(1), 53–82.
 Redding, S., & Weinstein, D. (2019). Aggregation and the gravity equation. American Economic Review Papers and Proceedings, 109, 450–455.
 Sampson, T. (2016). Dynamic selection: An idea flows theory of entry, trade, and growth. The Quarterly Journal of Economics, 131(1), 315–380.
 Santos Silva, J. M. C., & Tenreyro, S. (2015). Trading partners and trading volumes: Implementing the Helpman–Melitz–Rubinstein model empirically. Oxford Bulletin of Economics and Statistics, 77(1), 93–105.
 Santos Silva, J. M. C., & Tenreyro, S. (2006). The log of gravity. Review of Economics and Statistics, 88(4), 641–658.
References
 Ahlfeldt, G. M., Redding, S. J., Sturm, D. M., & Wolf, N. (2015). The economics of density: Evidence from the Berlin Wall. Econometrica, 83(6), 2127–2189.
 Allen, T., & Arkolakis, C. (2014). Trade and the topography of the spatial economy. Quarterly Journal of Economics, 129(3), 1085–1140.
 Alvarez, F., & Lucas, R. E., Jr. (2007). General equilibrium analysis of the Eaton–Kortum model of international trade. Journal of Monetary Economics, 54(6), 1726–1768.
 Anderson, J. E. (1979). A theoretical foundation for the gravity equation. The American Economic Review, 69(1), 106–116.
 Anderson, J. E., & van Wincoop, E. (2003). Gravity with gravitas: A solution to the border puzzle. American Economic Review, 93(1), 170–192.
 Anderson, J. E., & Yotov, Y. V. (2016). Terms of trade and global efficiency effects of free trade agreements, 1990–2002. Journal of International Economics, 99, 279–298.
 Anderson, J., Larch, M., & Yotov, Y. (2015). Growth and trade with frictions: A structural estimation framework. National Bureau of Economic Research, Working paper series No. 21377.
 Arkolakis, C., Costinot, A., Donaldson, D., & Rodríguez-Clare, A. (2015). The elusive pro-competitive effects of trade. National Bureau of Economic Research, Working paper series No. 21370.
 Arkolakis, C., Costinot, A., & Rodríguez-Clare, A. (2012). New trade models, same old gains? American Economic Review, 102(1), 94–130.
 Baier, S. L., & Bergstrand, J. H. (2001). The growth of world trade: Tariffs, transport costs, and income similarity. Journal of International Economics, 53(1), 1–27.
 Baier, S. L., & Bergstrand, J. H. (2007). Do free trade agreements actually increase members’ international trade? Journal of International Economics, 71(1), 72–95.
 Baier, S. L., Kerr, A., & Yotov, Y. V. (2018). Gravity, distance, and international trade. In B. Blonigen & W. Wilson (Eds.), Handbook of international trade and transportation (pp. 15–78). Northampton, MA: Edward Elgar.
 Baldwin, R. E., & Harrigan, J. (2011). Zeros, quality, and space: Trade theory and trade evidence. American Economic Journal: Microeconomics, 3(2), 60–88.
 Baldwin, R., & Taglioni, D. (2006). Gravity for dummies and dummies for gravity equations. National Bureau of Economic Research, Working paper series No. 12516.
 Behrens, K., & Murata, Y. (2012). Globalization and individual gains from trade. Journal of Monetary Economics, 59(8), 703–720.
 Bergstrand, J. H. (1985). The gravity equation in international trade: Some microeconomic foundations and empirical evidence. The Review of Economics and Statistics, 67(3), 474–481.
 Bergstrand, J. H., & Egger, P. (2011). Gravity equations and economic frictions in the world economy. In D. Bernhofen, R. Falvey, D. Greenaway, & U. Kreickemeier (Eds.), Palgrave handbook of international trade. London: Palgrave Macmillan.
 Bertoletti, P., Etro, F., & Simonovska, I. (2018). International trade with indirect additivity. American Economic Journal: Microeconomics, 10(2), 1–57.
 Caliendo, L., Feenstra, R. C., Romalis, J., & Taylor, A. M. (2017). Tariff reductions, entry, and welfare: Theory and evidence for the last two decades. National Bureau of Economic Research, Working paper series No. 21768.
 Caliendo, L., & Parro, F. (2015). Estimates of the trade and welfare effects of NAFTA. The Review of Economic Studies, 82(1), 1–44.
 Caron, J., Fally, T., & Markusen, J. R. (2014, May). International trade puzzles: A solution linking production and preferences. Quarterly Journal of Economics, 129(3), 1501–1552.
 Chaney, T. (2008). Distorted gravity: The intensive and extensive margins of international trade. American Economic Review, 98(4), 1707–1721.
 Chaney, T. (2018). The gravity equation in international trade: An explanation. Journal of Political Economy, 126(1), 150–177.
 Cheng, I.-H., & Wall, H. J. (2005). Controlling for heterogeneity in gravity models of trade and integration. Federal Reserve Bank of St. Louis Review, 87(1), 49–63.
 Deardorff, A. V. (1998). Determinants of bilateral trade: Does gravity work in a neoclassical world? In J. Frankel (Ed.), The regionalization of the world economy (pp. 7–32). Chicago: University of Chicago Press.
 De Benedictis, L., & Taglioni, D. (2010). The gravity model in international trade. In L. De Benedictis & L. Salvatici (Eds.), The trade impact of European Union preferential policies. Berlin: Springer-Verlag.
 Disdier, A.-C., & Head, K. (2008). The puzzling persistence of the distance effect on bilateral trade. Review of Economics and Statistics, 90(1), 37–48.
 Donaldson, D. (2018). Railroads of the Raj: Estimating the impact of transportation infrastructure. American Economic Review, 108(4–5), 899–934.
 Donaldson, D., & Hornbeck, R. (2016). Railroads and American economic growth: A ‘market access’ approach. The Quarterly Journal of Economics, 131(2), 799–858.
 Dornbusch, R., Fischer, S., & Samuelson, P. (1977). Comparative advantage, trade, and payments in a Ricardian model with a continuum of goods. American Economic Review, 67(5), 823–839.
 Eaton, J., & Kortum, S. (2002). Technology, geography and trade. Econometrica, 70(5), 1741–1779.
 Eaton, J., Kortum, S. S., & Sotelo, S. (2012). International trade: Linking micro and macro. National Bureau of Economic Research, Working paper series No. 17864.
 Eaton, J., & Tamura, A. (1995). Bilateralism and regionalism in Japanese and U.S. trade and direct foreign investment patterns. National Bureau of Economic Research, Working paper series No. 4758.
 Egger, P., Larch, M., Staub, K. E., & Winkelmann, R. (2011). The trade effects of endogenous preferential trade agreements. American Economic Journal: Economic Policy, 3(3), 113–143.
 Fally, T. (2015). Structural gravity and fixed effects. Journal of International Economics, 97(1), 76–85.
 Feenstra, R. C. (2004). Advanced international trade: Theory and evidence. Princeton, NJ: Princeton University Press.
 Felbermayr, G., Gröschl, J., & Steininger, M. (2018). Brexit through the lens of new quantitative trade theory. In Annual Conference on Global Economic Analysis at Purdue University.
 Frankel, J. A. (1997). Regional trading blocs. Washington, DC: Institute for International Economics.
 Fujita, M., Krugman, P. R., & Venables, A. J. (1999). The spatial economy: Cities, regions, and international trade. Cambridge, MA: MIT Press.
 Hallak, J. C. (2006). Product quality and the direction of trade. Journal of International Economics, 68(1), 238–265.
 Hallak, J. C. (2010). A product-quality view of the Linder hypothesis. Review of Economics and Statistics, 92(3), 453–466.
 Harrigan, J. (2003). Specialization and the volume of trade: Do the data obey the laws? In K. E. Choi & J. Harrigan (Eds.), Handbook of international trade (1st ed., pp. 85–118). Oxford, U.K.: Blackwell.
 Head, K., & Mayer, T. (2014). Gravity equations: Workhorse, toolkit, and cookbook. In G. Gopinath, E. Helpman, & K. Rogoff (Eds.), Handbook of International Economics (Vol. 4, pp. 131–195). North Holland: Elsevier.
 Head, K., Mayer, T., & Ries, J. (2010). The erosion of colonial trade linkages after independence. Journal of International Economics, 81, 1–14.
 Helliwell, J. F. (1997). National borders, trade and migration. Pacific Economic Review, 2(3), 165–185.
 Helpman, E. (1987). Imperfect competition and international trade: Evidence from fourteen industrial countries. Journal of the Japanese and International Economies, 1(1), 62–81.
 Helpman, E., & Krugman, P. R. (1989). Trade policy and market structure. Cambridge, MA: MIT Press.
 Helpman, E., Melitz, M., & Rubinstein, Y. (2008). Trading partners and trading volumes. Quarterly Journal of Economics, 123(2), 441–487.
 Hummels, D. (2007). Transportation costs and international trade in the second era of globalization. The Journal of Economic Perspectives, 21(3), 131–154.
 Krugman, P. R. (1979). Increasing returns, monopolistic competition, and international trade. Journal of International Economics, 9(4), 469–479.
 Krugman, P. R. (1980). Scale economies, product differentiation, and the pattern of trade. American Economic Review, 70(5), 950–959.
 Lai, H., & Trefler, D. (2002). The gains from trade with monopolistic competition: Specification, estimation and misspecification. National Bureau of Economic Research, Working paper series No. 9169.
 Leamer, E. E., & Stern, R. M. (1970). Quantitative international economics (1st ed.). Boston: Allyn and Bacon.
 Limao, N., & Venables, A. J. (2001, September). Infrastructure, geographical disadvantage, transport costs, and trade. The World Bank Economic Review, 15(3), 451–479.
 Linder, S. B. (1961). An essay on trade and transformation. Stockholm, Sweden: Almqvist & Wicksells.
 Linnemann, H. (1966). An econometric study of international trade flows. Amsterdam, The Netherlands: North-Holland.
 Magee, C. S. (2003). Endogenous preferential trade agreements: An empirical analysis. The B.E. Journal of Economic Analysis & Policy, 2(1), 1–19.
 McCallum, J. (1995). National borders matter. American Economic Review, 85(3), 615–623.
 Melitz, M. J. (2003). The impact of trade on intra-industry reallocations and aggregate industry productivity. Econometrica, 71(6), 1695–1725.
 Melitz, M., & Redding, S. (2014). Heterogeneous firms and trade. In G. Gopinath, E. Helpman, & K. Rogoff (Eds.), Handbook of International Economics (Vol. 4, pp. 131–195). North Holland: Elsevier.
 Novy, D. (2013). International trade without CES: Estimating translog gravity. Journal of International Economics, 89(2), 271–282.
 Perla, J., Tonetti, C., & Waugh, M. (2015). Equilibrium technology diffusion, trade, and growth. National Bureau of Economic Research, Working paper series No. 20881.
 Piermartini, R., & Yotov, Y. (2016). Estimating trade policy effects with structural gravity. School of Economics Working Paper Series. Philadelphia, PA: LeBow College of Business, Drexel University.
 Redding, S. J. (2011). Theories of heterogeneous firms and trade. Annual Review of Economics, 3(1), 77–105.
 Redding, S., & Weinstein, D. (2019). Aggregation and the gravity equation. American Economic Review Papers and Proceedings, 109, 450–455.
 Redding, S., & Venables, A. J. (2004). Economic geography and international inequality. Journal of International Economics, 62(1), 53–82.
 Sampson, T. (2016). Dynamic selection: An idea flows theory of entry, trade, and growth. The Quarterly Journal of Economics, 131(1), 315–380.
 Santos Silva, J. M. C., & Tenreyro, S. (2015). Trading partners and trading volumes: Implementing the Helpman–Melitz–Rubinstein model empirically. Oxford Bulletin of Economics and Statistics, 77(1), 93–105.
 Santos Silva, J. M. C., & Tenreyro, S. (2006). The log of gravity. Review of Economics and Statistics, 88(4), 641–658.
 Wei, S.-J. (1996). Intranational versus international trade: How stubborn are nations in global integration? National Bureau of Economic Research, Working paper series No. 5531.
 Yotov, Y. V. (2012). A simple solution to the distance puzzle in international trade. Economics Letters, 117(3), 794–798.
Notes
1. This review covers much of the evolution of the gravity equation. Like any review, the choice of topics covered is selective. The interested reader may also want to review other surveys: in particular, Head and Mayer (2014), Allen and Arkolakis (2015), Bergstrand and Egger (2011), De Benedictis and Taglioni (2010), Baier, Kerr, and Yotov (2018), and Piermartini and Yotov (2016).
2. The frictionless gravity model is derived in the next section. Leamer and Stern (1970), Anderson (1979), and Deardorff (1998) are among the earliest theoretical contributions to derive a frictionless gravity model.
3. The correlation between the natural logarithm of expenditure shares and production shares is 0.67.
4. Conditioning on the pair being contiguous, the elasticity of trade costs with respect to distance falls to 1.12.
5. Head, Mayer, and Ries (2010) highlighted this relationship.
6. Baier et al. (2018) show this more rigorously and develop a model that captures these interactions between trade agreements and other trade costs.
7. This section ignores the specific taste parameter for a country’s good ($\beta$), allowing one to relate factor prices to a country’s production technology.
8. It is assumed that there are no internal trade costs so that the landed price in country $i$ is equal to the factory gate price.
9. More recently, Bertoletti, Etro, and Simonovska (2018) derived a gravity-like equation when agents have indirectly additive preferences.
10. Notable exceptions were Eaton and Tamura (1995) and Frankel (1997).
11. PPML, GPML, and NLS belong to the same class of estimators, commonly referred to as generalized linear models (GLM).
12. For the reasons outlined in Cheng and Wall (2005) and Baier and Bergstrand (2007), five-year intervals are used.
13. As discussed earlier, these terms were typically approximated using “remoteness” variables that may be poor proxies for the multilateral resistance.
14. Another appealing feature of the PPML model is that the coefficient estimates on the exporter-year and importer-year fixed effects are the theoretically consistent multilateral resistance terms derived in the previous section (see Fally, 2015).
15. Egger, Larch, Staub, and Winkelmann (2011) examined the small sample properties of the different GLM estimators.
16. Eaton, Kortum, and Sotelo (2012) used a multinomial pseudo-maximum-likelihood estimator to estimate a gravity model using export shares. Head and Mayer (2014) found that this model performs well in the presence of zeros and when the variance of the error term is proportional to the mean. However, it performs less well when the variance is proportional to the square of the mean.
17. Baier and Bergstrand (2007) attributed this instability to the endogeneity of the trade agreements.