Printed from Oxford Research Encyclopedias, Physics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 03 December 2022

# The Partonic Content of Nucleons and Nuclei

• Juan Rojo, Department of Physics and Astronomy, VU University, and Nikhef Theory Group

### Summary

Deepening our knowledge of the partonic content of nucleons and nuclei represents a central endeavor of modern high-energy and nuclear physics, with ramifications in related disciplines, such as astroparticle physics. There are two main scientific drivers motivating these investigations of the partonic structure of hadrons. On the one hand, addressing fundamental open issues in our understanding of the strong interaction, such as the origin of the nucleon mass, spin, and transverse structure; the presence of heavy quarks in the nucleon wave function; and the possible onset of novel gluon-dominated dynamical regimes. On the other hand, pinning down with the highest possible precision the substructure of nucleons and nuclei is a central component for theoretical predictions in a wide range of experiments, from proton and heavy-ion collisions at the Large Hadron Collider to ultra-high-energy neutrino interactions at neutrino telescopes.

### Subjects

• Cosmology and Astrophysics
• Nuclear Physics
• Particles and Fields

Protons and neutrons, collectively known as nucleons, represent, together with electrons, the fundamental building blocks of matter and dominate the overall mass of the visible universe. Nucleons, as well as all other hadrons, are characterized by a rich internal substructure, being composed of elementary particles, quarks and gluons, collectively known as partons. These partons are tightly held together within nucleons thanks to the properties of the quantum field theory of the strong interaction: quantum chromodynamics (QCD).

The study of the partonic content of nucleons is one of the central endeavors of modern high-energy and nuclear physics. It is motivated by a number of fundamental open questions in our understanding of the strong interaction, such as the origin of the nucleon mass and spin, the three-dimensional profiling of hadron substructure, the role of heavy quarks in hadronic wave functions, and the potential onset of novel gluon-dominated dynamical regimes. In addition, deepening our knowledge of this partonic content of nucleons is of paramount importance for a wide range of theoretical predictions in high-energy processes, such as proton collisions at the Large Hadron Collider (LHC), heavy-ion collisions at the Relativistic Heavy Ion Collider (RHIC), and high-energy neutrino interactions at neutrino telescopes. Furthermore, the quark and gluon substructure of nucleons is modified once the latter become bound within heavy nuclei, leading to a remarkable pattern of nuclear effects whose study opens a novel window on the inner workings of the strong force in the nuclear environment.

The detailed investigation of the partonic content of nucleons and nuclei is, however, a challenging task. It requires the careful combination of state-of-the-art theoretical calculations and the widest possible range of experimental measurements by means of a statistically robust and efficient fitting methodology. This framework, the global QCD analysis of the nucleon structure, has experienced remarkable progress in recent years. Some important milestones include the assessment of the constraints on the proton structure provided by LHC measurements; the implementation of machine learning algorithms that dramatically speed up the analysis while minimizing procedural biases; the exploitation of precision–frontier calculations in both the strong and electroweak interactions; the deployment of tailored advanced computational tools, such as fast interfaces; novel methods to extract with high precision the photon content of the proton; an improved characterization of the role of the initial state of heavy-ion collisions for quark–gluon plasma studies; and the development of novel methods to estimate and propagate relevant sources of uncertainties to the final theory predictions.

Despite all these achievements, a long road still lies ahead, with pressing open questions, both of a theoretical and experimental nature, which are directly relevant for the full physics exploitation of current and future facilities. Some of these include puzzling results concerning the strange-quark content of protons, how gluons behave at very small and very large momentum fractions, the impact of lattice QCD simulations, the specific pattern of modifications in the partonic structure of heavy nuclei induced by nuclear effects, and the interplay between Standard Model measurements and searches for New Physics at the high-energy frontier.

### 1. Introduction

Elementary particles, such as leptons (e.g., electrons and neutrinos), do not have any known substructure. In contrast, hadrons (particles that experience the strong nuclear force) turn out not to be elementary but rather bound states composed of quarks and gluons, collectively known as partons. These partons are tightly held together within hadrons by virtue of the properties of the quantum theory of the strong interaction, QCD. Indeed, the mathematical structure of QCD implies that color-charged particles, such as quarks (color being the analog of the electric charge in the strong interaction), cannot exist in isolation and need to be confined within hadrons.

In addition to the light quarks (up and down) and gluons, hadrons also contain heavier quarks (strange and charm quarks in particular) as well as a photon component. Furthermore, the behavior of partons is modified once protons and neutrons (denoted as nucleons) become themselves the building blocks of heavy nuclei, reflecting a rich pattern of nuclear dynamics. Several of the most important properties related to the partonic constituents of hadrons, including their longitudinal and transverse momentum and spin distributions, are determined by the poorly understood nonperturbative regime of the strong force. In this regime, first-principles calculations are challenging and most partonic properties need to be extracted from experimental data by means of a global QCD analysis.

Understanding the partonic content of hadrons and heavy nuclei plays a crucial role in modern particle, nuclear, and astroparticle physics from a twofold perspective, summarized in Figure 1. On the one hand, these partonic properties offer a unique window to address fundamental questions about the inner workings of the strong interaction and hadron structure, such as what is the origin of the nucleon mass and spin, how is the motion of quarks and gluons modified inside heavy nuclei, or what determines the onset of new states of matter where gluons dominate. On the other hand, a precise quantitative description of the partonic substructure of nucleons is an essential input for any theoretical predictions for a variety of experiments, from proton–proton scattering at the high-energy frontier at the LHC to the collisions between heavy ions and to the interpretation of the data provided by neutrino and cosmic ray telescopes. In this respect, the determination of the partonic content of hadrons and nuclei offers a bridge between different areas of modern physics and of related disciplines, such as advanced statistics, data analysis, and machine learning.

This article presents a succinct introduction to our modern understanding of the quark, gluon, and photon content of nucleons and nuclei, focusing on recent results and trends. It also outlines and highlights possible future directions for the field of proton structure studies. The article is deliberately nontechnical, and for detailed discussions on the various topics covered here and an extensive survey of the related literature, the reader is encouraged to consult recent topical reviews (Forte & Watt, 2013; Gao, Harland-Lang, & Rojo, 2018; Kovařík, Nadolsky, & Soper, 2020; Rojo et al., 2015).

### 2. From the Quark Model to the Higgs Boson

Pushing forward our understanding of the partonic structure of nucleons and nuclei has been at the forefront of fundamental research in particle and nuclear physics for more than five decades. The story of quarks and gluons started around the early 1930s. By then, both the proton and neutron had been discovered, and thus the structure of atomic nuclei could be explained. At that time, protons and neutrons were assumed to be pointlike particles without further substructure, much like electrons. For a long time, protons and neutrons were the only known hadrons, that is, particles that interacted via the strong nuclear force, responsible in particular for keeping the atomic nucleus bound together.

The situation changed dramatically in the late 1940s with the discovery of neutral and charged pions in cosmic-ray experiments—which were the main toolbox of particle physicists before accelerators became powerful enough. The discovery of the pions was followed by discoveries of a plethora of other strongly interacting particles: kaons, rhos, lambdas, and omegas, just to name a few. Each of these new hadrons was characterized by different masses, electric charges, and spins (the intrinsic angular momenta of quantum particles). Physicists were confused and asked themselves how it was possible to establish some order in this chaos. In other words, what were the underlying laws of nature that determined the properties of the large number of hadrons observed? Furthermore, the new particles did not seem to play any obvious role in the properties of everyday matter, with the possible exception of the pion, which was thought to mediate the strong interaction. In addition, since the 1930s, the measurements of the anomalous magnetic moment of the proton and the neutron suggested that these were not pointlike particles like electrons.

The first breakthrough toward clarifying this confusing situation took place in the 1960s, when Gell-Mann and Zweig separately realized (Gell-Mann, 1964; Zweig, 1964) that the observed regularities in the hadron spectrum could be explained by assuming that they were not fundamental particles, but instead bound states composed of new hypothetical particles named quarks by Gell-Mann. These newly proposed elementary particles were characterized by being pointlike, having a fractional charge of either ± 2/3 or ± 1/3 in units of the electron charge, and half-integer spin. Furthermore, this quark model assumed that quarks existed in three different types or “flavors”: the up, down, and strange quarks, which differed in their masses. It was then possible to show that combining the three quarks in various ways reproduced the quantum numbers of most observed baryons (half-integer spin hadrons, assumed to be composed of three quarks) and mesons (integer-spin hadrons, composed of a quark–antiquark pair).

However, this quark model was purely phenomenological, and in particular did not provide a suitable mechanism to explain why quarks were tightly bound together within the hadrons. Furthermore, while the quark model was successful in describing hadron spectroscopy, it was unable to provide predictions for the observed behavior of strongly interacting particles in high-energy scattering experiments. For these reasons, physicists were for some time skeptical of the concept of quarks: while certainly a useful mathematical framework to describe hadron structure, their actual existence as real elementary particles was far from widely accepted. Such a skeptical position was further strengthened by the fact that free (isolated) quarks had never been observed: even worse, no particles with fractional charge had ever been found in nature.

The road toward the acceptance of quarks as bona fide elementary particles was paved by two momentous discoveries that took place in the late 1960s and early 1970s. The first of those arose in a series of experiments at the Stanford Linear Accelerator Center (SLAC), where energetic electrons were used as projectiles and were fired at protons and neutrons in atomic nuclei. By investigating the properties of the deflected electrons, physicists could probe for the first time the possible substructure of protons, in the same way as Rutherford’s experiments at the beginning of the century had established the existence of the atomic nucleus.

The SLAC measurements of this process (Bloom et al., 1969), known as deep-inelastic scattering (DIS), were consistent with a model for the proton composed of pointlike, noninteracting constituents of fractional electric charge, thereby providing the long-sought experimental evidence for the existence of quarks. Since those groundbreaking experiments, lepton–hadron DIS experiments like the SLAC experiments have provided the backbone for investigations of the partonic structure of nuclei. However, from the theoretical point of view, there were still important hurdles to be overcome before the quark model could be adopted universally. In particular, an explanation was needed for the fact that quarks appeared to be free within the nucleon, while strong interaction scattering processes were characterized in general by higher rates than the electromagnetic and weak ones. The latter properties implied the presence of a large coupling that determined the strength of the interaction, seemingly inconsistent with the SLAC measurements. How was it possible that the same interaction appeared to be either strong or weak depending on the specific process?

The second revolution that led to the modern understanding of the quark substructure of hadrons was provided by the realization in 1973 by Gross, Politzer, and Wilczek that the strong nuclear force could be described in the framework of a renormalizable quantum field theory (Gross & Wilczek, 1973; Politzer, 1973) in the same way as the electromagnetic force was described by quantum electrodynamics (QED). The new theory, called QCD, was formulated in terms of quark fields interacting via the exchange of force mediators called gluons that transmitted color charges, therefore playing a role analogous to the photon in electromagnetism. A crucial prediction of the new theory was that the strength of the interaction decreased for small distances, or alternatively for high energies, thus explaining why, in high-energy events, quarks appeared to be essentially free particles, while at low energies, where the coupling strength increases, quarks were bound within the hadrons.
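The decrease of the interaction strength with energy can be illustrated with the standard one-loop formula for the running of the strong coupling. The snippet below is only a minimal sketch: the reference value of the coupling at the Z-boson mass, the fixed number of quark flavors, and the function name `alpha_s` are assumptions made for illustration and are not taken from this article.

```python
import math

ALPHA_S_MZ = 0.118   # assumed reference value of the strong coupling at the Z mass
M_Z = 91.1876        # Z-boson mass in GeV


def alpha_s(Q, n_quark_flavors=5):
    """One-loop running coupling at the scale Q (in GeV).

    The number of quark flavors is kept fixed for simplicity (an assumption);
    a realistic calculation would change it across heavy-quark thresholds.
    """
    b0 = (33.0 - 2.0 * n_quark_flavors) / (12.0 * math.pi)
    return ALPHA_S_MZ / (1.0 + ALPHA_S_MZ * b0 * math.log(Q**2 / M_Z**2))


if __name__ == "__main__":
    # The coupling shrinks as the energy scale grows.
    for Q in (2.0, 10.0, M_Z, 1000.0):
        print(f"Q = {Q:9.2f} GeV   alpha_s = {alpha_s(Q):.3f}")
```

In this approximation the coupling is several times larger at a few GeV than at the TeV scale, which is the qualitative behavior described above.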

Other predictions of QCD were subsequently confirmed: in particular, evidence for the existence of gluons was obtained in electron–positron collisions in the late 1970s. In these events, the gluon appeared as a third stream of collimated hadrons (known as a jet) in addition to the two associated with a quark–antiquark pair (Brandelik et al., 1979). In the following years, the new theory of the strong interaction was also extended with additional heavier quarks, up to a total of six different flavors, with the discovery of the charm (1974), bottom (1977), and top (1995) quarks. The charm and bottom quarks have masses around 1.5 and 5 times the proton mass, respectively, and play an important role in the structure of heavy mesons and baryons. The top quark, on the other hand, is sufficiently heavy (around 185 times the proton mass) that it decays before it can form bound states with other quarks.

The combination of the predictive power of the quark model, the formulation of QCD as the quantum field theory of the strong nuclear force, and the results of high-energy experiments, such as SLAC’s DIS, confirmed beyond reasonable doubt that hadrons were bound states composed of quarks and gluons, tightly held together by the nonperturbative phenomena that dominate the strong nuclear force at low energies. It became customary to collectively denote quarks and gluons, as well as any eventual further component of hadrons, as partons. Early on, it was realized that the investigation of the partonic structure of nucleons was going to be a challenging endeavor.1 Indeed, crucial properties of quarks and gluons, such as their contribution to the total momentum and spin of the parent proton, are determined by low-energy nonperturbative QCD dynamics. This nonperturbative character implied that first-principles calculations of the partonic structure of nucleons could not be carried out within the framework of perturbation theory, where the use of Feynman diagrams had led to striking successes in the case of weakly coupled theories, such as QED.

While perturbative QCD (pQCD) could not be applied to determine important properties of hadron structure due to the dominance of the strong coupling regime, it was realized around the late 1970s that pQCD nevertheless could be used to impose important constraints on these properties. In particular, pQCD allowed the energy dependence of the momentum and spin fractions carried by quarks and gluons in the proton to be evaluated. In other words, if gluons are measured to carry a given fraction of the proton’s momentum at one specific energy, one can then predict how this fraction will be modified for any other energy sufficiently high for perturbation theory to hold. These evolution equations were formulated by Dokshitzer, Gribov, Lipatov, Altarelli, and Parisi (DGLAP; Altarelli & Parisi, 1977; Dokshitzer, 1977; Gribov & Lipatov, 1972) and have played a vital role in validating QCD as the correct theory of the strong force at high energies. Their predictions were spectacularly confirmed by the measurement of hard-scattering observables at the HERA lepton-proton collider (Klein & Yoshida, 2008) in the 1990s.

At the beginning of the studies of the partonic structure of hadrons, it became clear that in order to make progress, two main strategies could be pursued. The first one is based on a first-principles approach, whereby QCD is discretized on a space–time lattice allowing different nonperturbative quantities to be evaluated by means of powerful but CPU-intensive computer simulations. Until recently, the investigation of partonic properties using lattice QCD was, however, limited, due to a number of both conceptual and computational bottlenecks. This approach has experienced significant progress in recent years, as reviewed in the 2017 community white paper (Lin et al., 2018), and it is able now to provide valuable information on the properties of quarks and gluons within hadrons.

The second strategy exploits a central property of QCD known as factorization, whereby the total interaction cross section $σ$, which is proportional to the number of events of such type that will take place in a given period of time, can be separated into two independent contributions for processes involving hadrons in the initial state of the collision.2 The first one is the short-distance hard-scattering partonic cross section, calculable using Feynman diagrams in the same way as in QED. The second contribution encodes the information on the proton structure determined by long-distance, nonperturbative dynamics into objects called parton distribution functions (PDFs). The experimental measurements of such cross sections can then be used, by virtue of their factorizable properties, to extract the PDFs and thus achieve powerful insight into the partonic content and properties of protons.

This latter approach is usually known as a global QCD analysis, and it has been successfully deployed in the last three decades to understand in increasing detail the partonic structure of nucleons and nuclei, as is explained in the rest of this article. There are various different types of PDFs; for example, one can consider either spin-dependent or spin-independent PDFs, collinear-integrated PDFs, or transverse-momentum-dependent PDFs. This article focuses on the collinear unpolarized PDFs of nucleons and nuclei, which are, in the majority of the cases, the relevant quantities for the interpretation of the results of high-energy experiments, such as the proton–proton collisions at the LHC. The interested reader can find further information about spin-dependent PDFs in Aidala, Bass, Hasch, and Mallot (2013) and Nocera, Ball, Forte, Ridolfi, and Rojo (2014) and about transverse-momentum-dependent PDFs in Angeles-Martinez et al. (2015) and references therein.

In this article, the following notation is adopted for the collinear unpolarized parton distribution functions of the proton:

$$f_i(x,Q)\,, \qquad i=1,\ldots,n_f\,, \tag{1}$$

where $x$ stands for the fraction of the nucleon’s momentum carried by the $i$-th parton (often denoted as the Bjorken variable) and $Q$ denotes the energy scale (which also corresponds to the inverse of the resolution length and the momentum transferred in the scattering process) at which the nucleon is being probed. In Equation (1), the index $i$ indicates the parton flavor, and $n_f$ denotes the number of active partons at the scale $Q$. A quark is typically considered a massless parton when the momentum transfers involved are larger than its mass, $Q \gtrsim m_q$; otherwise, it is assumed to be a massive state that does not contribute to the partonic substructure of the proton. For instance, at $Q=10~\mathrm{GeV}$ one would have $n_f=11$, given that there are five quark PDFs (up, down, strange, charm, and bottom) and the corresponding antiquark counterparts, supplemented by the gluon PDF.
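The counting of active partons described above can be written as a short function. This is a minimal sketch: the quark mass values are rough illustrative assumptions (with the light quarks treated as massless), and the name `active_partons` is hypothetical.

```python
# Approximate quark masses in GeV. These numbers are illustrative assumptions;
# the up, down, and strange quarks are treated as massless.
QUARK_MASSES_GEV = {
    "up": 0.0,
    "down": 0.0,
    "strange": 0.0,
    "charm": 1.4,
    "bottom": 4.7,
    "top": 173.0,
}


def active_partons(Q):
    """Number of active partons at the scale Q (in GeV).

    Every quark lighter than Q contributes a quark and an antiquark PDF,
    and the gluon PDF is always present.
    """
    n_active_quarks = sum(1 for mass in QUARK_MASSES_GEV.values() if Q >= mass)
    return 2 * n_active_quarks + 1


if __name__ == "__main__":
    for Q in (1.0, 2.0, 10.0):
        print(f"Q = {Q:5.1f} GeV -> {active_partons(Q)} active partons")
```

With these assumed masses, the function reproduces the value of 11 active partons at a scale of 10 GeV quoted in the text.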

At leading order (LO) in the QCD perturbative expansion in the strong coupling $\alpha_s$, the PDFs can be interpreted as probability densities. This property implies that $g(x,Q=10~\mathrm{GeV})\,dx$ would correspond to the number of gluons in the proton at a scale of $Q=10~\mathrm{GeV}$ that carry a momentum fraction between $x$ and $x+dx$, and likewise for other quark flavor combinations. However, it is important to emphasize that this naive probabilistic interpretation is lost when higher-order effects in QCD are taken into account. In particular, it can be shown that PDFs become dependent on the specific manner in which higher-order perturbative corrections are evaluated, and thus cannot be associated with probabilities nor with any specific physical observable.

As mentioned before, the dependence of the PDFs on the momentum fraction $x$ is determined by low-energy nonperturbative QCD dynamics, a regime where the strong coupling becomes large and perturbative calculations are not reliable. Therefore, PDFs need to be extracted from experimental data as follows. Consider the process depicted in the left panel of Figure 2, where a charged lepton, such as a muon, scatters at high energy off a proton. In such a process, which is of the same family of DIS measurements as those used in the pioneering SLAC experiments, the charged lepton emits a virtual gauge boson (either a photon or a $Z$ boson) which then interacts with one of the quarks in the proton, in the example here, an up quark. The cross section for this process can be schematically expressed by

$$\sigma_{\mu p\to\mu X}(Q) \propto \widetilde{\sigma}_{\gamma^* u\to u}(Q) \otimes u(x,Q)\,, \tag{2}$$

where $Q$ is related to the momentum transfer between the muon and the proton and $\widetilde{\sigma}_{\gamma^* u\to u}$ is the hard partonic cross section for photon-quark scattering that can be computed in perturbation theory using Feynman diagrams. From Equation (2) it can be seen that, if the hadronic cross section $\sigma_{\mu p\to\mu X}$ is measured and the partonic one $\widetilde{\sigma}_{\gamma^* u\to u}$ can be evaluated using perturbative QCD calculations, then one can extract the PDF of the up quark $u(x,Q)$ from the data.

At this point it is important to recall that, while the dependence of the PDFs on the momentum fraction $x$ (the Bjorken variable) is indeed nonperturbative, their dependence on the momentum transfer $Q$ can be evaluated in perturbation theory. This can be achieved using the DGLAP evolution equations, which take the schematic form

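$$\frac{\partial f_i(x,Q^2)}{\partial \ln Q^2} = \sum_{j=1}^{n_f} \int_x^1 \frac{dz}{z}\, P_{ij}\!\left(\frac{x}{z},\alpha_s(Q^2)\right) f_j(z,Q^2)\,, \tag{3}$$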

where $n_f$ indicates the number of active partons that participate in the evolution, and $P_{ij}$ are perturbative kernels (the splitting functions) currently known up to $\mathcal{O}(\alpha_s^4)$. Thanks to the DGLAP equations, once the partonic structure of the nucleon has been determined at some low scale, say $Q_0\simeq 1~\mathrm{GeV}$, the behavior of the PDFs can be evaluated for any other scale $Q>Q_0$. The splitting functions $P_{ij}(x,\alpha_s(Q^2))$ model the probability that one of the partons in the proton (e.g., a quark or a gluon) will radiate another parton in the collinear approximation, where the two final-state particles move roughly in the same direction. Such an emission of a collinear quark or gluon will then modify the momentum fraction $x$ carried by the original parton, with a probability that depends on the scale $Q$ at which PDFs are being probed.
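A compact way to see DGLAP evolution at work is to follow the Mellin moments of a valence-like (non-singlet) quark combination at leading order, for which the evolution has a closed-form solution. The snippet below is only a sketch under stated assumptions: a one-loop coupling with a fixed number of quark flavors, an assumed reference value of the coupling, the standard leading-order quark-quark splitting function, and hypothetical function names and starting values.

```python
import math

C_F = 4.0 / 3.0            # QCD color factor
N_QUARK_FLAVORS = 4        # assumed fixed number of quark flavors
ALPHA_S_MZ = 0.118         # assumed coupling at the Z mass
M_Z = 91.1876              # Z-boson mass in GeV
B0 = (33.0 - 2.0 * N_QUARK_FLAVORS) / (12.0 * math.pi)


def alpha_s(Q):
    """One-loop running coupling at the scale Q (in GeV)."""
    return ALPHA_S_MZ / (1.0 + ALPHA_S_MZ * B0 * math.log(Q**2 / M_Z**2))


def gamma_qq(N):
    """N-th Mellin moment of the leading-order quark-quark splitting function."""
    harmonic_sum = sum(1.0 / k for k in range(1, N + 1))
    return C_F * (1.5 + 1.0 / (N * (N + 1)) - 2.0 * harmonic_sum)


def evolve_moment(moment_at_q0, N, Q0, Q):
    """Evolve the N-th moment of a non-singlet PDF from Q0 to Q at leading order."""
    exponent = -gamma_qq(N) / (2.0 * math.pi * B0)
    return moment_at_q0 * (alpha_s(Q) / alpha_s(Q0)) ** exponent


if __name__ == "__main__":
    # N = 1: number of valence quarks; N = 2: momentum fraction they carry.
    # The starting values (2.0 and 0.30 at Q0 = 2 GeV) are hypothetical.
    for Q in (2.0, 10.0, 100.0):
        number = evolve_moment(2.0, 1, 2.0, Q)
        momentum = evolve_moment(0.30, 2, 2.0, Q)
        print(f"Q = {Q:6.1f} GeV  number = {number:.3f}  momentum = {momentum:.3f}")
```

The first moment does not change with the scale, because the corresponding moment of the splitting function vanishes, whereas the momentum fraction carried by the valence quarks decreases as $Q$ grows, since radiation transfers momentum to gluons and sea quarks.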

In addition to the constraints imposed by DGLAP evolution, there are additional theory considerations that restrict the behavior of selected PDF combinations. In particular, energy-momentum conservation imposes the momentum sum rule,

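$$\int_0^1 dx\, x \left( \sum_{i=1}^{n_q} \left[ f_{q_i}(x,Q) + f_{\bar{q}_i}(x,Q) \right] + f_g(x,Q) \right) = 1\,, \tag{4}$$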

with $n_q$ the number of active quarks with masses $m_{q_i}<Q$, which follows from the property that the sum of the energies carried by all partons should add up to the total proton energy, while quark flavor number conservation leads to the quark number sum rules,

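$$\int_0^1 dx \left( f_u(x,Q) - f_{\bar{u}}(x,Q) \right) = 2\,, \qquad \int_0^1 dx \left( f_d(x,Q) - f_{\bar{d}}(x,Q) \right) = 1\,, \tag{5}$$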

reflecting the fact that the proton’s wave function contains two up quarks and one down quark. Note that these sum rules should hold for any value of the momentum transfer $Q$, and indeed it can be verified that DGLAP evolution ensures that if they are satisfied at some scale $Q0$, they will also be satisfied for all other scales.
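The way in which these sum rules constrain a PDF parametrization can be sketched numerically. In the toy example below, every functional form, exponent, and the sea normalization are hypothetical choices made purely for illustration; they are not taken from any actual PDF fit. The valence normalizations are fixed by the quark number sum rules and the gluon normalization by the momentum sum rule, and the integrals are then checked numerically.

```python
import math


def beta(a, b):
    """Euler Beta function, B(a, b) = integral of x^(a-1) (1-x)^(b-1) over [0, 1]."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def integrate01(func, n=20000, power=4):
    """Midpoint rule on [0, 1] after the change of variable x = t**power,
    which tames the integrable singularities of the toy PDFs at x -> 0."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += func(t**power) * power * t ** (power - 1)
    return total * h


# Hypothetical toy shapes of the form f(x) = A * x**(a-1) * (1-x)**b.
A_UV = 2.0 / beta(0.5, 4.0)   # two valence up quarks
A_DV = 1.0 / beta(0.5, 5.0)   # one valence down quark
A_SEA = 0.6                   # assumed normalization of the total quark sea


def u_valence(x):
    return A_UV * x**-0.5 * (1.0 - x) ** 3


def d_valence(x):
    return A_DV * x**-0.5 * (1.0 - x) ** 4


def sea(x):
    return A_SEA * x**-1.2 * (1.0 - x) ** 7


# Momentum carried by all quarks and antiquarks; the gluon takes the rest.
QUARK_MOMENTUM = integrate01(lambda x: x * (u_valence(x) + d_valence(x) + sea(x)))
A_GLUON = (1.0 - QUARK_MOMENTUM) / beta(0.8, 6.0)


def gluon(x):
    return A_GLUON * x**-1.2 * (1.0 - x) ** 5


if __name__ == "__main__":
    print("valence up number   :", round(integrate01(u_valence), 6))
    print("valence down number :", round(integrate01(d_valence), 6))
    print("gluon momentum      :", round(integrate01(lambda x: x * gluon(x)), 4))
```

In this toy setup the gluon normalization is not a free parameter: it is fully determined once the quark distributions are specified, which illustrates how the sum rules reduce the freedom available when parametrizing the PDFs.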

The schematic discussion above provides insight into the global PDF analysis paradigm. By combining a wide range of experimental measurements that involve proton targets with state-of-the-art theory calculations, it becomes possible to determine the quark and gluon substructure of the proton. Figure 3 presents the results of a recent determination of the proton structure, the NNPDF3.0 global analysis (Ball et al., 2015). This comparison shows the up- and down-quark valence PDFs, defined as $f_{u_V}=f_u-f_{\bar{u}}$ and $f_{d_V}=f_d-f_{\bar{d}}$, the sea quark PDFs $f_{\bar{u}}$, $f_{\bar{d}}$, $f_s$, $f_c$, $f_b$, as well as the gluon PDF $f_g$ (divided by 10). The PDFs are displayed both at $Q^2=10~\mathrm{GeV}^2$ and $10^4~\mathrm{GeV}^2$ as a function of the partonic momentum fraction, $x$. Recall that the dependence of the PDFs on the scale $Q$ is entirely fixed by the perturbative DGLAP evolution equations. For each of the PDFs shown in Figure 3, the size of the bands provides an estimate of the associated uncertainty.

From Figure 3, it can be seen that the overall shape of the quark valence distributions, $f_{u_V}$ and $f_{d_V}$, is consistent with the condition that the total number of valence quarks in the proton is two for the up quarks and one for the down quarks. This is a direct consequence of the valence sum rules of Equation (5). Furthermore, it is seen that, at large values of the momentum fraction $x$, the valence quark PDFs are the largest, while at smaller $x$, the proton is dominated by its sea quark and gluon components. Indeed, there is a steep rise at small $x$ in the number of gluons and sea quarks, which quickly dominate over the valence distributions, in particular as $Q$ is increased. The valence quarks $f_{u_V}$ and $f_{d_V}$ are relatively stable, while the gluons and sea quarks rise strongly as $Q^2$ rises from $10~\mathrm{GeV}^2$ to $10^4~\mathrm{GeV}^2$. This behavior implies that the higher the energies at which the internal structure of the proton is probed, the larger its gluonic component will be.

The key property that allows the exploitation of the information contained by the PDFs in different experiments is their universality. Thanks to the factorization theorems of the strong interaction, it is possible to determine the PDFs from a given type of process, such as lepton–proton scattering, and then use the same PDFs to compute predictions for different types of processes, such as proton–proton collisions. Figure 2 illustrates one important application of this PDF universality: by virtue of the QCD factorization theorems, the proton PDFs can be extracted from DIS measurements and then the same PDFs can be used to evaluate the production cross section for electroweak gauge bosons in proton–proton collisions, the so-called Drell-Yan process. It is important to emphasize that this universality property is highly nontrivial: beyond the leading approximation, PDFs need to reabsorb into their definition higher-order effects that arise from the radiation of extra quarks and gluons. Fortunately, it can be shown that such additional radiation effects are process independent, and thus they do not compromise the universality of the PDFs.
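At leading order, the way in which PDFs enter a proton–proton process such as Drell-Yan can be sketched through the quark–antiquark parton luminosity, in which one PDF is taken from each incoming proton. The snippet below is a minimal illustration: the toy quark and antiquark densities and the function names are hypothetical, and a realistic prediction would sum over flavors, include the electroweak couplings, and use PDFs from a global fit evolved to the relevant scale.

```python
import math


def quark(x):
    """Hypothetical toy quark density at a fixed scale."""
    return 2.0 * x**-0.5 * (1.0 - x) ** 3


def antiquark(x):
    """Hypothetical toy antiquark density at the same scale."""
    return 0.3 * x**-1.2 * (1.0 - x) ** 7


def luminosity(tau, n=2000):
    """Quark-antiquark luminosity, the integral over x from tau to 1 of
    (1/x) * quark(x) * antiquark(tau / x), where tau is the squared invariant
    mass of the produced system divided by the squared collider energy.
    Evaluated with the midpoint rule in ln(x)."""
    log_tau = math.log(tau)
    h = -log_tau / n
    total = 0.0
    for k in range(n):
        x = math.exp(log_tau + (k + 0.5) * h)
        total += quark(x) * antiquark(tau / x)
    return total * h


if __name__ == "__main__":
    for tau in (1e-3, 1e-2, 1e-1):
        print(f"tau = {tau:.0e}   luminosity = {luminosity(tau):.4g}")
```

The same functions `quark` and `antiquark` that would be constrained by DIS data are reused here without modification, which is the practical content of PDF universality.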

The technical complexities that underlie a global PDF analysis are schematically summarized in Figure 4. It comprises three main types of inputs: experimental measurements, theoretical calculations, and the methodological assumptions, such as those related to the PDF parametrization and the quark-flavor decomposition. This theory input involves higher-order perturbative calculations in both the strong and electroweak interactions for the DGLAP evolution kernels and for the hard-scattering matrix elements. In addition, these higher-order calculations need to be interpolated in the form of fast-access grids (Bertone, Frederix, Frixione, Rojo, & Sutton, 2014; Carli et al., 2010; Wobisch, Britzger, Kluge, Rabbertz, & Stober, 2011) in order to satisfy the requirements of the CPU-time-intensive PDF fits.

The inputs are processed through a fitting program that returns the most likely values of the PDFs and their associated uncertainties. Several methods have been developed to estimate PDF uncertainties, with the Hessian (Pumplin et al., 2001) and the Monte Carlo (Del Debbio, Forte, Latorre, Piccione, & Rojo, 2007) approaches being the most popular. In addition, approximate techniques have been constructed to emulate within certain approximations the results of a full PDF fit in a much faster way, in particular the Bayesian reweighting of Monte Carlo sets (Ball et al., 2012) and the profiling of Hessian sets (Paukkunen & Zurita, 2014). Subsequently, the results of this global PDF analysis are statistically validated and made publicly available, and their phenomenological implications for the LHC and other experiments are studied. It is beyond the scope of this article to describe each of the components listed in Figure 4; for further detail, the reader could consult Gao et al. (2018) and references therein.
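The two most common ways of turning a PDF set into an uncertainty on a prediction can be sketched as follows. In the Monte Carlo approach the prediction is evaluated for each replica and the spread over replicas is taken as the uncertainty, while in the Hessian approach the prediction is evaluated for pairs of eigenvector sets and the differences are combined in quadrature (the symmetric version of the formula is used here). The function names and all input numbers below are hypothetical.

```python
import math
import statistics


def monte_carlo_estimate(replica_values):
    """Central value and uncertainty from predictions computed with each
    Monte Carlo replica: mean and standard deviation over the replicas."""
    return statistics.mean(replica_values), statistics.stdev(replica_values)


def hessian_uncertainty(eigenvector_pairs):
    """Symmetric Hessian uncertainty from predictions computed with the
    'plus' and 'minus' member of each eigenvector direction."""
    return 0.5 * math.sqrt(sum((plus - minus) ** 2 for plus, minus in eigenvector_pairs))


if __name__ == "__main__":
    # Hypothetical predictions for some cross section, in arbitrary units.
    replicas = [1.0, 1.2, 0.8, 1.1, 0.9]
    central, sigma = monte_carlo_estimate(replicas)
    print(f"Monte Carlo: {central:.3f} +/- {sigma:.3f}")

    pairs = [(1.05, 0.95), (1.02, 0.99)]
    print(f"Hessian    : +/- {hessian_uncertainty(pairs):.3f}")
```

Real PDF sets typically contain of the order of a hundred replicas or several tens of eigenvector members, but the bookkeeping is the same as in this small example.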

Global fits of parton distributions, such as the one displayed in Figure 3, have played a central role in the interpretation of experimental measurements in lepton–hadron and hadron–hadron collisions in the last three decades. As highlighted by Figure 2, any theoretical predictions for processes at colliders that involve protons in the initial state, such as HERA and the LHC, necessarily require PDFs as input. Parton distributions in particular were one of the theoretical inputs that contributed to the discovery of the Higgs boson in 2012 by the ATLAS and CMS experiments (Aad et al., 2012; Chatrchyan et al., 2012), recognized with the Nobel Prize in Physics in 2013, and which heralded a new era for elementary particle physics. The Higgs boson can be rightly qualified as the most spectacular particle ever encountered. First of all, it is the only known particle that couples to anything that has mass. Second, it transmits a new type of fundamental force, which is completely different from all other interactions such as electromagnetism. Third, it is exquisitely sensitive to quantum effects taking place at the tiniest of the distances. Nowadays, improving our understanding of the quark and gluon structure of the proton makes it possible to scrutinize the Higgs sector of the Standard Model in greater detail, and indeed several recent developments in the field of PDF fits aim to strengthen the robustness of theoretical predictions for Higgs-boson production at the LHC (de Florian et al., 2016).

So far, the discussion in this article has been restricted to the partonic structure of free protons. A closely related field of research focuses on the study of the partonic structure of nucleons (protons and neutrons) that are bound within heavy nuclei. The quark and gluon structure of bound nucleons is parametrized by the so-called nuclear parton distribution functions (nPDFs). It was discovered by the EMC experiment in the 1980s (Aubert et al., 1983), to great surprise, that the parton distributions of bound nucleons were significantly different from those of their free-nucleon counterparts. Such behavior was quite unexpected, since nuclear binding effects are characterized by MeV-scale momentum transfers, which should be negligible when compared to the typical momentum transfers ($Q \gtrsim 1\,\mathrm{GeV}$) involved in DIS.

In addition to this suppression of the nPDFs at $x≃0.4$ (the EMC effect), other nuclear modifications that have been reported include an enhancement at large $x$ (Fermi motion) and a further suppression in the small-$x$ region known as nuclear shadowing. The left panel of Figure 5 displays a schematic representation of how different nuclear effects modify the PDFs of nucleons bound within heavy nuclei as compared to those of free nucleons as a function of $x$. Note that in such comparisons the limit $R_A=1$ corresponds to the absence of nuclear corrections.

From the conceptual and methodological points of view, global fits of nPDFs proceed in a similar way as those of the free nucleon PDFs, and they are summarized in Figure 4. This said, there are two important differences compared to the free proton case. First, the available experimental constraints are greatly reduced in the nuclear case, with a restricted kinematical coverage both in $x$ and in $Q^2$. Second, the fitting methodology must be extended to also explain the dependence on the nuclear mass number $A$. This implies that nPDFs will depend on three variables, $f_i^{(A)}(x,Q^2,A)$, two of which ($x$ and $A$) have a nonperturbative origin and thus need to be parameterized and extracted from data. Several groups have presented fits of nPDFs with varying input data sets, theory calculations, and methodological assumptions. The right plot of Figure 5 displays a comparison of the nuclear modifications for the $\Sigma+T_8/4$ quark combination in copper ($A=64$) at $Q^2=10\,\mathrm{GeV}^2$ between the nNNPDF1.0 (Abdul Khalek, Ethier, & Rojo, 2019d), nCTEQ15 (Kovarik et al., 2016), and EPPS16 (Eskola, Paakkinen, Paukkunen, & Salgado, 2017) nPDF sets, where the bands indicate the associated uncertainties. While there is reasonable agreement in the central values, there are marked differences in the uncertainty estimates, in particular in the small- and large-$x$ regions.
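To make the $A$ dependence concrete, one common convention (assumed here for illustration; individual fitting groups differ in the details) writes the PDF of the average nucleon bound in a nucleus with $Z$ protons and $A-Z$ neutrons as

$$
f_i^{(A)}(x,Q^2,A) = \frac{Z}{A}\, f_i^{(p/A)}(x,Q^2) + \frac{A-Z}{A}\, f_i^{(n/A)}(x,Q^2),
$$

where $f_i^{(p/A)}$ and $f_i^{(n/A)}$ denote the distributions of bound protons and neutrons. The nuclear modification ratio is then obtained by dividing by the same combination built from free-nucleon PDFs,

$$
R_i^{A}(x,Q^2) = \frac{f_i^{(A)}(x,Q^2,A)}{\frac{Z}{A}\, f_i^{(p)}(x,Q^2) + \frac{A-Z}{A}\, f_i^{(n)}(x,Q^2)},
$$

so that $R_i^{A}=1$ signals the absence of nuclear corrections. The symbols $Z$, $f_i^{(p/A)}$, and $f_i^{(n/A)}$ are notation introduced for this sketch.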

In addition to shedding light on the inner workings of the strong force in the nuclear environment, the accurate determination of nPDFs represents an important ingredient for the interpretation of the results of the heavy-ion programs of RHIC and the LHC, where heavy nuclei, such as lead ($A=208$), are collided. In such heavy-ion collisions, it becomes possible to study the properties of the quark–gluon plasma (QGP), the hot and dense medium created in the early universe and that now can be replicated in the laboratory. Whenever hard probes such as jets, weak bosons, or heavy quarks are produced, nPDFs enter the initial state of heavy-ion collisions (Abreu et al., 2008). Therefore, improving the understanding of nPDFs is important in order to discriminate between the cold and hot nuclear-matter effects in those complex events, involving hundreds or even thousands of produced particles.

Next, the article presents the current state of (n)PDF determinations, emphasizing recent breakthroughs and important results in the field.

### 3. The Partonic Structure of Hadrons in the LHC Era

Here, the main components of state-of-the-art global analyses of the quark and gluon substructure of nucleons and nuclei are presented and described.

#### 3.1 Global PDF Fit

Several collaborations provide regular updates on their global proton PDF analyses, including ABMP (Alekhin, Blümlein, Moch, & Placakyte, 2017), CT (Hou et al., 2019), MMHT (Harland-Lang, Martin, Motylinski, & Thorne, 2015), and NNPDF (Ball et al., 2017). The differences among them stem from several of the aspects that define a PDF fit (see Figure 4). To begin with, the input data sets are not the same; for example, ABMP does not include jet measurements, and not all groups include the same neutrino DIS experiments. Second, theoretical calculations can differ: for instance, part of the disagreement between ABMP and other groups can be traced back to the treatment of the heavy-quark corrections in DIS structure functions. Furthermore, the treatment of the Standard Model inputs in the PDF analysis is not homogeneous: while CT and NNPDF take external parameters such as the strong coupling constant $\alpha_s(m_Z)$ and the charm and bottom-quark masses $m_c$ and $m_b$ from their Particle Data Group averages (Patrignani et al., 2016), the ABMP and MMHT groups instead determine these parameters simultaneously with the PDFs.

The third, and perhaps most important, source of differences between the PDF fitting groups is related to methodological assumptions, such as the way to parametrize PDFs and to estimate the associated uncertainties. Here there are two main groups: ABMP, CT, and MMHT are based on the Hessian method (with or without tolerances) and a polynomial parametrization, while NNPDF adopts the Monte Carlo method with artificial neural networks as universal unbiased interpolants. Figure 6 compares the gluon and the down antiquark PDFs at $Q=100GeV$ between the ABMP16, MMHT14, CT14, and NNPDF3.1 fits, normalized to the central value of the latter. For each fit, the bands represent the 68% confidence level uncertainties. Similar comparison plots (also for related quantities, such as PDF luminosities) can be straightforwardly produced by means of the APFEL-Web online PDF plotter (Carrazza, Ferrara, Palazzo, & Rojo, 2015a).

#### 3.2 Impact of LHC Data

Traditionally, hadron colliders were considered the domain of discovery physics, while the cleaner lepton colliders had the monopoly of precision physics. However, the outstanding performance of the LHC and its experiments, complemented by the recent progress in higher-order QCD and electroweak calculations, have demonstrated that the LHC is able to pursue a vigorous and exciting precision physics program. In the context of studies of the partonic content of nucleons and nuclei, this precision program aims to exploit the information contained in LHC measurements to pin down the quark and gluon PDFs and to reduce their uncertainties. In turn, this should lead to improved theory predictions for many important processes at the LHC and elsewhere, ranging from Higgs production to dark matter searches.

Until around 2012, all PDF fits were restricted to DIS measurements (from both fixed-target experiments and from the HERA collider), fixed-target Drell-Yan cross sections, and some jet production measurements from the Tevatron. Fortunately, the situation has improved dramatically in recent years. A large number of LHC processes are now staple components of global PDF analyses: inclusive and differential Drell-Yan measurements, the transverse momentum of $W$ and $Z$ bosons, top-quark pair and single-top production, and heavy meson production, among others. In most cases, state-of-the-art next-to-next-to-leading order (NNLO) QCD calculations are available, allowing fully consistent NNLO PDF fits based on the widest possible range of fixed-target and collider measurements to be carried out.

In order to illustrate the breadth of collider measurements included in a typical modern PDF fit, the left panel of Figure 7 displays the kinematic coverage in the $(x,Q)$ plane of the experimental data included in a recent CT analysis (Hobbs, Wang, Nadolsky, & Olness, 2019). One can observe that this coverage extends down to $x≃5\times 10^{-5}$ and up to several TeV in $Q$, and furthermore that in most cases several processes constrain the PDFs in the same region of the $(x,Q)$ plane. Around half of the data sets in this fit correspond to LHC measurements. One important advantage of this rich data set is the redundancy it provides, meaning that a given PDF will typically be constrained by several data sets, resulting in a significantly more robust PDF extraction. This important property is illustrated in the case of the large-$x$ gluon in the right panel of Figure 7, which displays the results from the global NNPDF3.1 NNLO fit compared with the corresponding results in fits without any jet, top quark, or $Z$ $p_T$ measurements included. In all cases, the resultant gluon PDFs can be seen to be consistent within uncertainties, highlighting the complementarity of the various gluon-sensitive measurements provided by the LHC.

#### 3.3 PDFs With Theory Uncertainties

An aspect of the global PDF fitting machinery that has attracted a lot of attention is the estimate and propagation of the associated uncertainties. In most PDF fits, the uncertainty estimate accounts only for the propagation of the uncertainties corresponding to the input experimental measurements, as well as for methodological components associated with, for example, the choice of functional form. PDF analyses, however, do not account in general for the theory uncertainties associated with the truncation of the perturbative expansions of the input QCD calculations. While neglecting these theory errors might have been justified in the past, the rapid pace of improvements in PDF fits, leading to ever smaller uncertainties, suggests that such an assumption should be revisited.

An important recent development in global PDF analyses has been the formulation of frameworks to systematically include theoretical uncertainties, in particular those associated with the missing higher orders (MHOs) in the perturbative QCD calculations used as input to the fit. The basic idea of one method (Abdul Khalek et al., 2019b, 2019c) is to construct a combined covariance matrix that includes both experimental and theoretical components, with the latter estimated using the scale-variation method and validated whenever possible with existing higher-order calculations. It is important to mention that other approaches to estimate the impact of MHOs in PDF fits have been presented elsewhere, for example by Harland-Lang and Thorne (2019), who propose a consistent use of scale variations in PDF fits and the corresponding physical predictions. More research is needed on this important topic, particularly if the scale-variation approach turns out not to be the optimal strategy to estimate the impact of MHOs in PDF fits.

One of the main consequences of the covariance matrix approach developed by the NNPDF group, beyond an overall increase of the total uncertainties, is that theory-induced correlations now connect different measurements, such as DIS and Drell-Yan or top-quark production, which are experimentally uncorrelated. This feature is illustrated by the left panel of Figure 8, which displays the combined experimental and theoretical correlation matrix for the data set used in the NNPDF3.1 NLO analysis. This shows, for example, that theory uncertainties introduce a positive correlation between DIS and Drell-Yan in some kinematic regions, and a negative one in others.
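The way in which scale variations generate correlations between otherwise unrelated data points can be illustrated with a toy calculation. The sketch below assumes a theory covariance matrix of the form $S_{ij} = \frac{1}{N}\sum_k \Delta_i^{k}\Delta_j^{k}$, where $\Delta_i^{k}$ is the shift of the prediction for data point $i$ under the $k$-th scale variation; the normalization and the set of variations depend on the prescription adopted, and all function names and numbers here are hypothetical.

```python
import numpy as np

def theory_covmat(central, varied):
    """S_ij = (1/N) sum_k Delta_i^k Delta_j^k from N scale-varied predictions."""
    central = np.asarray(central, dtype=float)
    shifts = np.asarray(varied, dtype=float) - central  # shape (N, n_data)
    return shifts.T @ shifts / shifts.shape[0]

def chi2(data, theory, cov_exp, cov_th=None):
    """chi^2 computed with the combined covariance C_exp + S (or C_exp alone)."""
    cov = cov_exp if cov_th is None else cov_exp + cov_th
    r = np.asarray(data, dtype=float) - np.asarray(theory, dtype=float)
    return float(r @ np.linalg.solve(cov, r))

# Toy numbers, purely illustrative: two data points from different processes
central = np.array([1.00, 2.00])
varied = np.array([[1.10, 2.30],   # predictions with scales varied upwards
                   [0.90, 1.70]])  # predictions with scales varied downwards
cov_exp = np.diag([0.05**2, 0.10**2])  # experimentally uncorrelated
S = theory_covmat(central, varied)
```

In this toy setup the off-diagonal entry of `S` is positive even though `cov_exp` is diagonal, which is the mechanism by which theory uncertainties link experimentally uncorrelated measurements.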

This formalism has been used to carry out an updated version of the NNPDF3.1 NLO global analysis, now accounting for the MHOs. The right panel of Figure 8 displays the results of the NNPDF3.1 NLO fit with theory uncertainties (red hatched band) for the gluon PDF at $Q=10GeV$, compared with a baseline with only experimental errors in the covariance matrix (green solid band) and with the central value of the corresponding NNLO fit. There are two main consequences of including the theory uncertainties in the fit covariance matrix. First, there is a moderate increase of the PDF uncertainties, showing that theory errors cannot be neglected when estimating the total PDF uncertainties, at least at NLO. Second, there is a shift in the central values due to the rebalancing effect of theory errors and their correlations, which in general is found to shift the fit results toward the corresponding NNLO result. The latter is a desirable feature of any method that estimates theory uncertainties, whose ultimate goal is to accurately predict the results of the next perturbative order.

#### 3.4 The Photon PDF

A perhaps unexpected observation whose implications were only fully appreciated recently is related to the fact that the partonic content of the proton should be composed not only of quarks and gluons, but also of photons. Indeed, it can be demonstrated that protons radiate quasi-real photons that induce photon-initiated scattering reactions in lepton–proton and in proton–proton collisions, and that these reactions can be described as arising from a photon PDF $f_\gamma(x,Q^2)$. This photon PDF receives two types of contributions, as schematically indicated in the upper left panel of Figure 9: from elastic processes, where the proton is left intact, and from inelastic processes, where the proton breaks up after the scattering.

The presence of a photon component in the proton has two main phenomenological implications. First, the photon mixes with the quark and gluon PDFs via QED effects in the DGLAP evolution, and thus modifies the results of the latter as compared to the QCD-only case. In particular, the photon subtracts from the quarks and gluons a small amount of the total momentum of the proton, which is currently estimated to amount to a few permille. Second, photon-initiated contributions result in the opening of new channels for important hard-scattering processes. An important example of this is shown in the bottom left panel of Figure 9: photon-initiated processes contribute to the total Drell-Yan cross section (lepton pair production) on the same footing as the quark–antiquark annihilation reactions.
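The statement that the photon carries away part of the proton momentum can be made explicit through the momentum sum rule, which in the presence of a photon PDF reads

$$
\int_0^1 dx\, x \left[ \Sigma(x,Q^2) + g(x,Q^2) + f_\gamma(x,Q^2) \right] = 1,
\qquad
\Sigma(x,Q^2) = \sum_q \left[ q(x,Q^2) + \bar{q}(x,Q^2) \right].
$$

Here $g$ denotes the gluon PDF and $\Sigma$ the sum over all quark and antiquark flavors, a notation introduced for this illustration. The photon term in the integral is the few-permille contribution quoted above, which must be compensated by a corresponding reduction of the quark and gluon momentum fractions.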

Until recently, the photon PDF was determined either from model assumptions (Martin, Roberts, Stirling, & Thorne, 2005; Schmidt, Pumplin, Stump, & Yuan, 2016) or freely parametrized and constrained by experimental data (Ball et al., 2013). Neither of these options was satisfactory, the former due to the bias associated with the choice of model, the latter since the results were affected by large uncertainties due to limited available constraints. A major breakthrough was then the demonstration that the photon content of the proton needs neither to be modeled nor to be fitted from data, but can be evaluated from first principles by expressing it in terms of the well-measured inclusive DIS structure functions $F_2$ and $F_L$. This result was originally formulated by means of the equivalent photon approximation (Harland-Lang, Khoze, & Ryskin, 2016; Martin & Ryskin, 2014), and subsequently was placed on a more rigorous footing by the LUXqed formalism developed by Manohar, Nason, Salam, and Zanderighi (2016, 2017).

Since these pioneering analyses were presented, other global PDF fitting groups have provided QED variants of their PDF sets that include both QED corrections in the DGLAP evolution and a photon PDF determined by means of the LUXqed calculation or variations thereof (Bertone, Carrazza, Hartland, & Rojo, 2018; Harland-Lang, Martin, Nathvani, & Thorne, 2019). Given the relatively tight constraints imposed by the LUXqed framework, the resulting photon PDFs turn out to be quite similar. To illustrate this, the right panel of Figure 9 displays a comparison of the photon PDFs $f_\gamma(x,Q)$ at $Q=100\,\mathrm{GeV}$ between the NNPDF and MMHT QED analyses, normalized to the central value of the latter. The two photon PDFs agree at the few percent level in most of the relevant $x$ range, with the exception of the large-$x$ region, where differences can be as large as 15%.

#### 3.5 Implications for Astroparticle Physics

Another topic that has received considerable attention in recent years has been the interplay between proton structure studies in nuclear and particle physics with those in astroparticle physics. This interest was motivated by the realization that improving our understanding of the partonic content of nucleons and nuclei was an important ingredient for the theoretical predictions relevant for the interpretation of high-energy astrophysics experiments. This connection is particularly important for neutrino telescopes, such as IceCube and KM3NeT, which instrument large volumes of ice and water as effective detectors of energetic neutrinos, as well as for cosmic ray detectors, such as Auger, which aim to detect the most energetic particles in the universe.

In the specific case of high-energy neutrino astronomy, QCD calculations are required for two different aspects of the data interpretation. First, they are needed to predict the expected event rates of neutrino–nucleus interactions at high energies (upper left panel of Figure 10) via the neutral-current and charged-current DIS processes. Knowledge of these cross sections is required to predict both the number of neutrinos that will interact with the nuclei in the ice or water targets and the attenuation of the incoming neutrino flux as it traverses the Earth (Vincent, Argüelles, & Kheirandish, 2017). As indicated in the bottom left panel of Figure 10, for a neutrino with an energy of $E_\nu=5\times 10^{10}\,\mathrm{GeV}$, the charged-current DIS process probes the proton PDFs down to $x≃10^{-8}$ (Cooper-Sarkar, Mertsch, & Sarkar, 2011), a region far beyond present experimental constraints. The interpretation of the measurements of neutrino telescopes could thus be hindered by theoretical uncertainties in QCD and the partonic content of protons.

Second, QCD processes are also relevant for the prediction of the dominant background for neutrino astronomy, the so-called prompt neutrino flux. In this process, represented in the upper right panel of Figure 10, an energetic cosmic ray, typically a proton, collides with an air nucleus and produces a charmed meson, which promptly decays into neutrinos. At high energies, the flux of such neutrinos becomes larger than those from pion and kaon decays, since the short lifetime of $D$ mesons implies that their energy is not attenuated before decay. As in the case of the neutrino cross sections, the calculation of the prompt neutrino flux involves knowledge of the proton structure for $Q≃m_c=1.5\,\mathrm{GeV}$ down to $x≃10^{-6}$ (Zenaiev et al., 2015), where theoretical uncertainties are large.

The key ingredient that allows solid theoretical predictions to be made for both types of processes is the exploitation of collider measurements sensitive to the small-$x$ structure of the proton, such as the charm production data in the forward region from the LHCb experiment. Forward $D$ meson production at LHCb, when Lorentz-boosted to the center-of-mass frame, covers the same kinematic region as that of charm production in energetic cosmic ray collisions, and therefore makes it possible to pin down the gluon distribution at small-$x$ beyond the reach of the HERA collider data and in the kinematical region of relevance for neutrino astronomy. The bottom right of Figure 10 displays the significant reduction in the PDF uncertainties of the small-$x$ gluon once the LHCb charm production measurements at $\sqrt{s}=5$, 7, and 13 TeV are included in the NNPDF3.0 fit (Gauld & Rojo, 2017). Thanks to this connection with LHC hard-scattering processes, it has become possible to provide robust state-of-the-art predictions for the signal (Bertone, Gauld, & Rojo, 2019) and background (Bhattacharya et al., 2016) processes at neutrino telescopes with reduced theoretical uncertainties. In the future, a situation can be envisaged where measurements at high-energy astroparticle-physics experiments can be used to constrain the small-$x$ structure of the nucleon and the associated QCD dynamics in this regime.
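The reach of forward charm production toward small $x$ can be estimated with leading-order kinematics. The sketch below assumes the simplified case in which the charm quark and antiquark are produced with the same rapidity $y$ and transverse mass $m_T$, for which the momentum fractions of the two incoming partons are $x_{1,2} = (2 m_T/\sqrt{s})\, e^{\pm y}$; the function name and the input values are illustrative choices, not numbers taken from the measurements.

```python
import math

def lo_momentum_fractions(m_t, rapidity, sqrt_s):
    """Leading-order x1, x2 for a heavy-quark pair with equal rapidities.

    m_t      : transverse mass sqrt(m^2 + pT^2) of each quark, in GeV
    rapidity : common rapidity y of the pair in the collider frame
    sqrt_s   : centre-of-mass energy of the collision, in GeV
    """
    scale = 2.0 * m_t / sqrt_s
    return scale * math.exp(rapidity), scale * math.exp(-rapidity)

# Illustrative inputs (assumed): m_c = 1.5 GeV, pT = 0, y = 4.5, sqrt(s) = 13 TeV
x1, x2 = lo_momentum_fractions(1.5, 4.5, 13000.0)
print(f"x1 = {x1:.2e}, x2 = {x2:.2e}")
```

For these inputs the smaller momentum fraction lies at a few times $10^{-6}$, which is why forward charm data can constrain the gluon in a region that the HERA measurements do not reach.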

#### 3.6 PDFs and BSM Searches

As more data are collected and the precision of the LHC measurements improves, the information on the partonic structure of the proton that can be obtained from them will consequently increase. In particular, the large integrated luminosities that will be accumulated in future runs of the LHC imply that extended measurements in the TeV region should make possible a more precise determination of poorly known PDFs, such as the large-$x$ sea quark and the gluon distributions (see also the left panel of Figure 7). A potential worry in using LHC data in this high-energy region in PDF fits is that the fits could be biased if deviations with respect to the Standard Model were present. This concern is becoming more important since the lack of new particles or interactions detected so far at the LHC might suggest the presence of a band gap between the electroweak scale and the scale of new physics $Λ$. In turn, such a band gap could imply that Beyond the Standard Model (BSM) dynamics could very well manifest itself at the LHC only via subtle deviations in the tails of the measured distributions.

A powerful framework to interpret the results of the LHC in a model-independent manner is provided by the Standard Model Effective Field Theory (SMEFT; Brivio & Trott, 2019). Within the mathematical language of the SMEFT, the effects of BSM dynamics at high energies $Λ≫v$ above the electroweak scale ($v=246GeV$) can be parametrized at lower energies, $E≪Λ$, in terms of higher-dimensional operators built up from Standard Model fields and satisfying its symmetries, such as gauge invariance. Since several of the processes that are used at the LHC to constrain the SMEFT degrees of freedom are also used as inputs to the PDF determinations, it is important to assess their interplay and to establish whether one could use the global PDF fits to disentangle BSM effects from QCD dynamics, as originally proposed by Berger et al. (2010).

A first study in this direction was recently presented by Carrazza, Degrande, Iranipour, Rojo, and Ubiali (2019), where variants of the NNPDF3.1 DIS-only fit were carried out using theory calculations where the Standard Model had been extended by specific subsets of dimension-6 four-fermion SMEFT operators. The left panel of Figure 11 shows the results of the gluon PDF at large-$x$ for $Q=10\,\mathrm{GeV}$ in those fits. For benchmark points in the parameter space not already excluded by other experiments, the shift due to SMEFT corrections in the theory calculation is at most half of the PDF uncertainty. Crucially, one can exploit how the value of the fit-quality $\chi^2$ changes with the energy of the process to disentangle QCD effects from genuine BSM dynamics, as shown in the right panel of Figure 11: the former are smooth as the energy increases, since DGLAP evolution effects are logarithmic in $Q^2$, while the EFT corrections are much more marked since they scale as a power of $Q^2$. While this study found that SMEFT-induced distortions are subdominant with respect to PDF uncertainties, it was restricted to DIS measurements and the picture could change significantly in the case of LHC measurements, especially the high-statistics ones from future LHC runs.
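The different energy scalings invoked in this argument can be written schematically as follows. For a dimension-6 operator with a dimensionless coefficient $c$ (a symbol introduced here for illustration), the leading correction to a cross section arises from interference with the Standard Model amplitude and grows as

$$
\sigma(Q^2) \simeq \sigma_{\rm SM}(Q^2)\left[ 1 + c\,\frac{Q^2}{\Lambda^2} + \mathcal{O}\!\left(\frac{Q^4}{\Lambda^4}\right) \right],
$$

whereas the scale dependence of the PDFs is governed by the DGLAP equations,

$$
\frac{\partial f_i(x,Q^2)}{\partial \ln Q^2} = \frac{\alpha_s(Q^2)}{2\pi} \sum_j \int_x^1 \frac{dz}{z}\, P_{ij}(z)\, f_j\!\left(\frac{x}{z},Q^2\right),
$$

which induce only a logarithmic variation with $Q^2$. This is a simplified sketch that keeps only the leading splitting functions $P_{ij}$ and the leading power of $Q^2/\Lambda^2$.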

#### 3.7 Constraints on nPDFs From LHC Data

Concerning nPDFs, perhaps the most important recent development has been the availability of a wide variety of hard probe measurements from proton–lead collisions at the LHC. Similar to the proton PDF case, these measurements provide new and valuable constraints on various nPDF combinations, in particular for those that can only be loosely constrained from DIS data, such as the gluon and specific quark flavor PDFs. The information on the nuclear modifications of the gluon PDF is particularly valuable, since these are essentially unconstrained if only DIS data are used in the nPDF fit. Furthermore, these hard-scattering LHC measurements offer novel opportunities to test the validity of both the nPDF universality and of the QCD factorization properties in the nuclear environment. Of specific interest is the small-x regime, where eventual nonlinear saturation dynamics are expected to be enhanced as compared to the free-nucleon case.

LHC measurements from proton–lead collisions that have demonstrated their constraining power for the determination of nPDFs include dijet production (sensitive to the gluon; Eskola, Paakkinen, & Paukkunen, 2019), $D$ meson production in the forward region from LHCb (to pin down the small-$x$ gluon nPDF shadowing; Kusina, Lansberg, Schienbein, & Shao, 2018), and electroweak gauge boson production (providing a handle on the different quark flavor distributions). Figure 12 illustrates the information on the nPDFs provided by hard probes in proton–lead collisions from the LHC. In the left panel, the nuclear ratio $R_g^{\mathrm{Pb}}$ for gluons in lead at $Q=100\,\mathrm{GeV}$ is shown in the EPPS16 analysis before and after including the constraints from the CMS dijet measurements at $\sqrt{s}=5.02$ TeV via Hessian profiling. One can observe how these data reduce the uncertainties of $R_g^{\mathrm{Pb}}$ for a wide range of $x$ values, and in particular they appear to suggest the presence of gluon shadowing at small $x$. Then the right panel of Figure 12 displays the forward-backward asymmetry in $W$ production in proton–lead collisions from CMS, compared with the nCTEQ15 predictions before and after including this measurement in their fit. The agreement between data and theory is clearly improved once the data on $A_{FB}$ are added to the analysis, with $\chi^2/n_{\mathrm{dat}}$ decreasing from 4.03 to 1.31; also, the uncertainties in the theory prediction (which arise from the flavor decomposition of the quark nPDF) are similarly reduced.

### 4. Summary and Outlook

As discussed throughout this article, the study of the partonic structure of nucleons and nuclei represents an exciting and rich field at the crossroads of different aspects of particle, nuclear, and astroparticle physics. On the one hand, it allows us to shed light on the dynamics of the strong nuclear force, providing crucial input on pressing questions like the origin of the mass and the spin of protons, the possible onset of new gluon-dominated regimes at small-x, the strange and heavy quark contribution to the nucleon’s wave function, and the dynamics of quarks and gluons for nucleons bound within heavy nuclei. On the other hand, it provides essential input for the theoretical predictions in processes as diverse as the production of Higgs bosons at the LHC, the interactions of high-energy neutrinos at IceCube, and the collisions between heavy nuclei at RHIC.

In recent years, the field has witnessed a number of important results that highlight its vitality and productivity. Just a few of the results include: the demonstration of the impact on the proton structure of precision LHC measurements; the improved determinations of the strange, charm, and photon content of the proton; the first evidence for BFKL small-$x$ dynamics in HERA data (Ball et al., 2018); the calculation of $x$-space distributions with lattice QCD; the formulation of new frameworks to estimate and propagate theory uncertainties; the establishment of the connection with neutrino telescopes and cosmic ray physics; and the analysis of the interplay between proton structure and direct and indirect searches for New Physics at the high-energy frontier.

Many of these achievements in the understanding of the partonic structure of nucleons and nuclei have only become possible thanks to progress from the methodological side, from the development of new machine learning algorithms to parameterize and train the PDFs (Carrazza & Cruz-Martinez, 2019) to strategies to combine and compress different sets of parton distributions (Carrazza, Latorre, Rojo, & Watt, 2015b; Gao & Nadolsky, 2014) and new methods to quantify and represent graphically the information provided by individual data sets (Wang et al., 2018). Nevertheless, this list is necessarily incomplete due to the space restrictions of this article, and the interested reader is encouraged to turn to the more extended technical reviews mentioned in the introduction.

While great progress has been made in addressing long-standing questions in our knowledge of the partonic structure of the proton, important questions still remain that should be addressed in the coming years. A major bottleneck in this respect will be to resolve some apparent incompatibilities that affect current PDF interpretations of high-precision LHC measurements for processes such as jet, top quark, and gauge boson production. With statistical uncertainties at the few permille level in many cases, confronting theory calculations with the experimental data requires dealing with several hitherto ignored effects, such as the dependence of the results on the experimental correlation model and reported tensions between different processes and different distributions within the same process. Addressing and overcoming these issues will be crucial to maximally exploit the information contained in LHC data for precision PDF determinations.

To conclude, two possible future directions for the field are highlighted. The first one focuses on the exploitation of the constraints that will be provided by PDF-sensitive measurements at future facilities, some of them already approved, such as the High-Luminosity LHC, and some others still under discussion, such as the Electron Ion Collider (EIC) or the Large Hadron electron Collider (LHeC). The second direction points toward the integration of different aspects of the global QCD analysis paradigm into a unified framework that combines them in a fully consistent way.

In order to further improve understanding of the proton structure, several future and proposed facilities will play a crucial role. The high-luminosity upgrade of the LHC (HL-LHC), which will operate between 2027 and the late 2030s, will deliver a total integrated luminosity of more than $L=3\,\mathrm{ab}^{-1}$ for ATLAS and CMS and $L=0.3\,\mathrm{ab}^{-1}$ for LHCb. This large data set will provide ample opportunities for PDF studies, in particular with the measurements of cross sections in the few TeV region for processes such as dijet, top quark pair, direct photon, and Drell-Yan production, or the transverse momentum distributions of weak gauge bosons. These measurements would constrain the large-$x$ behavior of the poorly known gluon and sea quarks, which in turn would lead to improved searches for new heavy particles predicted in scenarios of New Physics beyond the Standard Model (Beenakker et al., 2016).

One proposed future facility that would provide unique input for both the proton and nuclear PDFs would be the LHeC (Abelleira Fernandez et al., 2012). The LHeC would extend the kinematical coverage achieved at HERA by more than one order of magnitude at small-$x$ and at large-$Q^2$, while operating both with proton and with light and heavy nuclear beams. The upper panels of Figure 13 display the kinematic coverage in the $(x,Q^2)$ plane of the HL-LHC and the LHeC. This comparison highlights their complementarity, with the LHeC providing a superior handle in the small-$x$ region (and in particular allowing for tests of novel QCD dynamics) and the HL-LHC offering unparalleled reach in the high-energy frontier. The bottom left panel of Figure 13 displays the expected reduction in the PDF uncertainties of the gluon-gluon luminosity at the LHC (14 TeV) based on the LHeC and HL-LHC pseudo-data projections presented in Abdul Khalek, Bailey, Gao, Harland-Lang, and Rojo (2018, 2019a). In these forecasts, the baseline is taken to be the PDF4LHC15 set (Butterworth et al., 2016), which represents current best knowledge of the proton PDFs. One can observe that in the most favorable scenario, where both HL-LHC and LHeC operate simultaneously, one might achieve an uncertainty reduction by up to an order of magnitude for several values of the final state invariant mass $M_X$.

The EIC (Accardi et al., 2016), a recently approved U.S.-based proposal, would also be able to scrutinize the properties of nucleons and nuclei in unprecedented detail. The bottom right panel of Figure 13 shows the improvement in the determination of the nuclear gluon PDF expected at the EIC for two scenarios for its center-of-mass energy in the case of the nNNPDF1.0 analysis. These projections indicate that the EIC could constrain the nuclear gluon modifications down to $x\simeq 10^{-4}$ and would thus allow one to probe the possible onset of new QCD dynamics, such as nonlinear (saturation) effects. Furthermore, the EIC would have the unique feature of being able to chart the proton spin structure in the small-$x$ regime, as well as its three-dimensional structure for the first time in a wide kinematic range.

As mentioned before, another possible direction in which one should expect future progress in the field of nucleon structure is in the combination of individual QCD fits. This can be illustrated with a specific example: Several proton PDF fits include data taken on nuclear targets; however, they neglect to account for the effects of nuclear modifications. Furthermore, most nuclear PDF fits assume a proton PDF baseline (and its uncertainties) that has been extracted by other groups with generally different methodologies and input assumptions. Such a situation is far from optimal, since the assumption that the extraction of proton and nuclear PDFs can be decoupled from each other is not justified anymore, given recent progress in experimental data, theory calculations, and methodological developments. Indeed, a more robust approach would extract simultaneously the free nucleon ($A=1$) and the nuclear ($A>1$) PDFs from a single QCD analysis, and thus be able to keep track of their mutual interplay.
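The interplay between proton and nuclear fits can be illustrated with a deliberately simplified toy, which is not the methodology of any actual PDF analysis. In the sketch below, a free-proton observable depends on a single normalization `a`, while a nuclear observable depends on the same `a` plus a hypothetical nuclear-modification parameter `c`; all functional forms, parameter values, and function names are assumptions chosen for illustration. A decoupled fit freezes `a` at an external baseline, whereas the joint fit determines `a` and `c` together by linear least squares.

```python
import numpy as np


def basis(x):
    """Common x-dependence shared by the toy proton and nuclear observables."""
    return (1.0 - x) ** 3


def make_pseudodata(a_true=2.0, c_true=-0.5, n=20):
    """Noise-free pseudo-data for a proton (A = 1) and a nuclear (A > 1) observable."""
    x = np.linspace(0.05, 0.8, n)
    y_p = a_true * basis(x)
    y_A = a_true * basis(x) + c_true * x * basis(x)
    return x, y_p, y_A


def decoupled_fit(x, y_A, a_baseline):
    """Fit only the nuclear parameter c, with the proton baseline frozen."""
    design = x * basis(x)
    residual = y_A - a_baseline * basis(x)
    return float(design @ residual / (design @ design))


def joint_fit(x, y_p, y_A):
    """Fit a and c simultaneously from proton and nuclear data."""
    zeros = np.zeros_like(x)
    design = np.vstack([
        np.column_stack([basis(x), zeros]),          # proton rows: only a enters
        np.column_stack([basis(x), x * basis(x)]),   # nuclear rows: a and c enter
    ])
    target = np.concatenate([y_p, y_A])
    (a, c), *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(a), float(c)


if __name__ == "__main__":
    x, y_p, y_A = make_pseudodata()
    print("joint fit (a, c):", joint_fit(x, y_p, y_A))
    print("decoupled c, exact baseline:", decoupled_fit(x, y_A, 2.0))
    print("decoupled c, 5% biased baseline:", decoupled_fit(x, y_A, 2.1))
```

In this toy, a modest bias in the frozen proton baseline is absorbed entirely into the nuclear parameter, whereas the joint fit recovers both inputs. Real analyses involve many correlated parameters and experimental uncertainties, so this should be read only as a qualitative illustration of why the decoupling assumption can distort the extracted nuclear modifications.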

Similar considerations apply to other aspects of the QCD fitting paradigm. For instance, there is a natural cross-talk between the (un)polarized PDFs and the hadron fragmentation functions, which are connected by means of the semi-inclusive DIS (SIDIS) processes. SIDIS differs from standard DIS because one of the hadrons in the final state has been identified. This implies that the cross section for this process depends on both the PDFs of the initial state proton and the fragmentation functions of the final state hadron. Therefore, one should strive to determine simultaneously the (un)polarized PDFs and the fragmentation functions from the same joint QCD analysis, as done for example in Ethier, Sato, and Melnitchouk (2017) and Sato, Andres, Ethier, and Melnitchouk (2020). This approach can provide information on partonic properties that is not available via other channels, for instance to constrain the strange content of the proton from unpolarized SIDIS measurements (Borsa, Sassot, & Stratmann, 2017; Sato et al., 2020).
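The joint dependence on PDFs and fragmentation functions is most transparent at leading order in QCD, where the unpolarized SIDIS structure function for an identified hadron $h$ takes the schematic form below. The notation is an assumption introduced here for illustration: $z$ denotes the fraction of the fragmenting parton's momentum carried by $h$, $D_q^h$ the corresponding fragmentation function, and $e_q$ the quark electric charge in units of the positron charge. Higher-order corrections replace the simple products by convolutions.

$$F_2^h(x,z,Q^2)=x\sum_q e_q^2\left[f_q(x,Q^2)\,D_q^h(z,Q^2)+f_{\bar{q}}(x,Q^2)\,D_{\bar{q}}^h(z,Q^2)\right]$$

Because each term is a product of a PDF and a fragmentation function, an error in one can be compensated by a shift in the other when they are fitted separately. This is also why identified kaons, whose fragmentation functions weight strange quarks differently from light quarks, can offer a handle on the strange content of the proton.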

In the long term, the community should aim to assemble a truly global analysis of nonperturbative QCD objects to extract simultaneously the unpolarized and polarized proton PDFs, the nPDFs, and the hadron fragmentation functions. However, two main requirements need to be satisfied in order to reach such an ambitious goal. The first is methodological progress on the fitting side, to ensure that the large parameter space that arises once all QCD objects are jointly extracted can be explored efficiently. The second is the availability of new facilities that can provide suitable experimental measurements to constrain all the relevant nonperturbative QCD objects and their correlations. In this context, a machine like the EIC would be particularly suited for the realization of this ultimate “integrated” global QCD analysis.

To summarize, the investigation of the partonic content of nucleons and nuclei represents a thriving research field that allows us to tackle pressing open questions in particle, nuclear, and astroparticle physics. As discussed here, several recent breakthroughs have made possible a greatly improved understanding of the quark and gluon substructure of nucleons and nuclei. While several challenges lie ahead, one can confidently predict that ongoing progress in theory, methodology, and experiment will make it possible to overcome them and to push forward the frontier of our studies of the quark and gluon parton distributions.

### Acknowledgments

The author has been supported by the European Research Council Starting Grant PDF4BSM as well as by the Netherlands Organization for Scientific Research (NWO).

#### References

• Aad, G., Abajyan, T., Abbott, B., Abdallah, J., Abdel Khalek, S., Abdelalim, A. A., Abdinov, O., Aben, R. . . . Zwalinski, L. (2012). Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Physics Letters B, 716, 1–29.
• Abdul Khalek, R., Bailey, S., Gao, J., Harland-Lang, L., & Rojo, J. (2018). Towards ultimate parton distributions at the high-luminosity LHC. European Physical Journal C, 78(11), 962.
• Abdul Khalek, R., Bailey, S., Gao, J., Harland-Lang, L., & Rojo, J. (2019a). Probing proton structure at the large hadron electron collider. SciPost Physics, 7(4), 051.
• Abdul Khalek, R., Ball, R. D., Carrazza, S., Forte, S., Giani, T., Kassabov, Z. . . . Wilson, M. (2019b). A first determination of parton distributions with theoretical uncertainties. European Physical Journal C, 79, 838.
• Abdul Khalek, R., Ball, R. D., Carrazza, S., Forte, S., Giani, T., Kassabov, Z. . . . Wilson, M. (2019c). Parton distributions with theory uncertainties: General formalism and first phenomenological studies. European Physical Journal C, 79(11), 931.
• Abdul Khalek, R., Ethier, J. J., & Rojo, J. (2019d). Nuclear parton distributions from lepton-nucleus scattering and the impact of an electron-ion collider. European Physical Journal C, 79(6), 471.
• Abelleira Fernandez, J., Adolphsen, C., Akay, A. N., Aksakal, H., Albacete, J. L., Alekhin, S. (2012). A large hadron electron collider at CERN: Report on the physics and design concepts for machine and detector. Journal of Physics G, 39, 075001.
• Armesto, N., Borghini, N., Jeon, S., Wiedemann, U. A., Abreu, S., Akkelin, S. V. . . . Antonov, D. (2008). Proceedings, workshop on heavy ion collisions at the LHC: Last call for predictions. Journal of Physics G, 35, 054001.
• Accardi, A., Albacete, J. L., Anselmino, M., Armesto, N., Aschenauer, E. C., Bacchetta, A. . . . Zheng, L. (2016). Electron ion collider: The next QCD frontier. European Physical Journal A, 52(9), 268.
• Aidala, C. A., Bass, S. D., Hasch, D., & Mallot, G. K. (2013). The spin structure of the nucleon. Reviews of Modern Physics, 85, 655–691.
• Alekhin, S., Blümlein, J., Moch, S., & Placakyte, R. (2017). Parton distribution functions, $\alpha_s$, and heavy-quark masses for LHC Run II. Physical Review D, 96(1), 014011.
• Altarelli, G., & Parisi, G. (1977). Asymptotic freedom in parton language. Nuclear Physics B, 126, 298.
• Angeles-Martinez, R., Bacchetta, A., Balitsky, I. I., Boer, D., Boglione, M., Boussarie, R. . . . Wallon, S. (2015). Transverse momentum dependent (TMD) parton distribution functions: Status and prospects. Acta Physica Polonica B, 46(12), 2501–2534.
• Aubert, J. J., Bassompierre, G., Becks, K. H., Best, C., Böhm, E., de Bouard, X. . . . Wimpenny, S. J. (1983). The ratio of the nucleon structure functions $F_2^N$ for iron and deuterium. Physics Letters B, 123, 275–278.
• Ball, R. D., Bertone, V., Bonvini, M., Marzani, S., Rojo, J., & Rottoli, L. (2018). Parton distributions with small-x resummation: Evidence for BFKL dynamics in HERA data. European Physical Journal C, 78(4), 321.
• Ball, R. D., Bertone, V., Cerutti, F., Del Debbio, L., Forte, S., Guffanti, A. . . . Ubiali, M. (2012). Reweighting and unweighting of parton distributions and the LHC W lepton asymmetry data. Nuclear Physics B, 855, 608–638.
• Ball, R. D., Bertone, V., Carrazza, S., Del Debbio, L., Forte, S., Guffanti, A. . . . Rojo, J. (2013). Parton distributions with QED corrections. Nuclear Physics B, 877, 290–320.
• Ball, R. D., Bertone, V., Carrazza, S., Deans, C. S., Del Debbio, L., Forte, S. . . . Ubiali, M. (2015). Parton distributions for the LHC Run II. Journal of High Energy Physics, 4, 040.
• Ball, R. D., Bertone, V., Carrazza, S., Del Debbio, L., Forte, S., Groth-Merrild, P. . . . Ubiali, M. (2017). Parton distributions from high-precision collider data. European Physical Journal C, 77(10), 663.
• Beenakker, W., Borschensky, C., Krämer, M., Kulesza, A., Laenen, E., Marzani, S., & Rojo, J. (2016). NLO+NLL squark and gluino production cross-sections with threshold-improved parton distributions. European Physical Journal C, 76(2), 53.
• Berger, E. L., Guzzi, M., Lai, H.-L., Nadolsky, P. M., & Olness, F. I. (2010). Constraints on color-octet fermions from a global parton distribution analysis. Physical Review D, 82, 114023.
• Bertone, V., Carrazza, S., Hartland, N. P., & Rojo, J. (2018). Illuminating the photon content of the proton within a global PDF analysis. SciPost Physics, 5(1), 008.
• Bertone, V., Frederix, R., Frixione, S., Rojo, J., & Sutton, M. (2014). aMCfast: Automation of fast NLO computations for PDF fits. Journal of High Energy Physics, 1408, 166.
• Bertone, V., Gauld, R., & Rojo, J. (2019). Neutrino telescopes as QCD microscopes. Journal of High Energy Physics, 1, 217.
• Bhattacharya, A., Enberg, R., Jeong, Y. S., Kim, C. S., Reno, M. H., Sarcevic, I., & Stasto, A. (2016). Prompt atmospheric neutrino fluxes: Perturbative QCD models and nuclear effects. Journal of High Energy Physics, 11, 167.
• Bloom, E. D., Coward, D. H., DeStaebler, H., Drees, J., Miller, G., Mo, L. W. . . . Kendall, H. W. (1969). High-energy inelastic e p scattering at 6-degrees and 10-degrees. Physical Review Letters, 23, 930–934.
• Borsa, I., Sassot, R., & Stratmann, M. (2017). Probing the sea quark content of the proton with one-particle-inclusive processes. Physical Review D, 96(9), 094020.
• Brandelik, R., Braunschweig, W., Gather, K., Kadansky, V., Lübelsmeyer, K., Mättig, P. . . . Zobernig, G. (1979). Evidence for planar events in e+ e- annihilation at high-energies. Physics Letters B, 86, 243–249.
• Brivio, I., & Trott, M. (2019). The Standard Model as an effective field theory. Physics Reports, 793, 1–98.
• Butterworth, J., Carrazza, S., Cooper-Sarkar, A., De Roeck, A., Feltesse, J., Forte, S. . . . Thorne, R. (2016). PDF4LHC recommendations for LHC Run II. Journal of Physics G, 43, 023001.
• Carli, T., Clements, D., Cooper-Sarkar, A., Gwenlan, C., Salam, G. P., Siegert, F. . . . Sutton, M. (2010). A posteriori inclusion of parton density functions in NLO QCD final-state calculations at hadron colliders: The APPLGRID Project. European Physical Journal C, 66, 503.
• Carrazza, S., & Cruz-Martinez, J. (2019). Towards a new generation of parton densities with deep learning models. European Physical Journal C, 79(8), 676.
• Carrazza, S., Degrande, C., Iranipour, S., Rojo, J., & Ubiali, M. (2019). Can New Physics hide inside the proton? Physical Review Letters, 123(13), 132001.
• Carrazza, S., Ferrara, A., Palazzo, D., & Rojo, J. (2015a). APFEL Web: A web-based application for the graphical visualization of parton distribution functions. Journal of Physics G, 42, 057001.
• Carrazza, S., Latorre, J. I., Rojo, J., & Watt, G. (2015b). A compression algorithm for the combination of PDF sets. European Physical Journal C, 75, 474.
• Chatrchyan, S., Khachatryan, V., Sirunyan, A. M., Tumasyan, A., Adam, W., Aguilo, E. . . . Wenman, D. (2012). Observation of a new boson at a mass of $125\,\mathrm{GeV}$ with the CMS experiment at the LHC. Physics Letters B, 716, 30–61.
• Cooper-Sarkar, A., Mertsch, P., & Sarkar, S. (2011). The high energy neutrino cross-section in the Standard Model and its uncertainty. Journal of High Energy Physics, 8, 042.
• de Florian, D., Grojean, C., Maltoni, F., Mariotti, C., Nikitenko, A., Pieri, M. . . . Tanaka, R. (Eds.). (2016). Handbook of LHC Higgs cross sections: 4. Deciphering the Nature of the Higgs Sector, 1610, 19–20.
• Del Debbio, L., Forte, S., Latorre, J. I., Piccione, A., & Rojo, J. (2007). Neural network determination of parton distributions: The nonsinglet case. Journal of High Energy Physics, 3, 039.
• Dokshitzer, Y. L. (1977). Calculation of the structure functions for deep inelastic scattering and $e^+e^-$ annihilation by perturbation theory in quantum chromodynamics (In Russian). Journal of Experimental and Theoretical Physics of the Academy of Sciences of the USSR, 46, 641–653.
• Eskola, K. J., Paakkinen, P., & Paukkunen, H. (2019). Non-quadratic improved Hessian PDF reweighting and application to CMS dijet measurements at 5.02 TeV. European Physical Journal C, 79(6), 511.
• Eskola, K. J., Paakkinen, P., Paukkunen, H., & Salgado, C. A. (2017). EPPS16: Nuclear parton distributions with LHC data. European Physical Journal C, 77(3), 163.
• Ethier, J. J., Sato, N., & Melnitchouk, W. (2017). First simultaneous extraction of spin-dependent parton distributions and fragmentation functions from a global QCD analysis. Physical Review Letters, 119(13), 132001.
• Forte, S., & Watt, G. (2013). Progress in the determination of the partonic structure of the proton. Annual Review of Nuclear and Particle Science, 63, 291.
• Gao, J., Harland-Lang, L., & Rojo, J. (2018). The structure of the proton in the LHC precision era. Physics Reports, 742, 1–121.
• Gao, J., & Nadolsky, P. (2014). A meta-analysis of parton distribution functions. Journal of High Energy Physics, 1407, 035.
• Gauld, R., & Rojo, J. (2017). Precision determination of the small-x gluon from charm production at LHCb. Physical Review Letters, 118(7), 072001.
• Gell-Mann, M. (1964). A schematic model of baryons and mesons. Physics Letters, 8, 214–215.
• Gribov, V. N., & Lipatov, L. N. (1972). Deep inelastic ep scattering in perturbation theory. Soviet Journal of Nuclear Physics, 15, 438–450.
• Gross, D. J., & Wilczek, F. (1973). Ultraviolet behavior of nonabelian gauge theories. Physical Review Letters, 30, 1343–1346.
• Harland-Lang, L. A., Khoze, V. A., & Ryskin, M. G. (2016). Photon-initiated processes at high mass. Physical Review D, 94(7), 074008.
• Harland-Lang, L. A., Martin, A. D., Motylinski, P., & Thorne, R. S. (2015). Parton distributions in the LHC era: MMHT 2014 PDFs. European Physical Journal C, 75, 204.
• Harland-Lang, L. A., Martin, A. D., Nathvani, R., & Thorne, R. S. (2019). Ad lucem: QED parton distribution functions in the MMHT framework. European Physical Journal C, 79, 811.
• Harland-Lang, L. A., & Thorne, R. S. (2019). On the consistent use of scale variations in PDF fits and predictions. European Physical Journal C, 79(3), 225.
• Hobbs, T. J., Wang, B.-T., Nadolsky, P. M., & Olness, F. I. (2019). Charting the coming synergy between lattice QCD and high-energy phenomenology. Physical Review D, 100, 094040.
• Hou, T. J., Xie, K., Gao, J., Dulat, S., Guzzi, M., Hobbs, T. J. . . . Yuan, C. P. (2019). Progress in the CTEQ-TEA NNLO global QCD analysis. arXiv.
• Klein, M., & Yoshida, R. (2008). Collider physics at HERA. Progress in Particle and Nuclear Physics, 61, 343–393.
• Kovařík, K., Kusina, A., Ježo, T., Clark, D. B., Keppel, C., Lyonnet, F. . . . Yu, J. Y. (2016). nCTEQ15—Global analysis of nuclear parton distributions with uncertainties in the CTEQ framework. Physical Review D, 93(8), 085037.
• Kovařík, K., Nadolsky, P. M., & Soper, D. E. (2020). Hadronic structure in high-energy collisions. Reviews of Modern Physics, 92, 045003.
• Kusina, A., Lansberg, J.-P., Schienbein, I., & Shao, H.-S. (2018). Gluon shadowing in heavy-flavor production at the LHC. Physical Review Letters, 121(5), 052004.
• Lin, H. W., Nocera, E. R., Olness, F., Orginos, K., Rojo, J., Accardi, A., . . . Zanotti, J. (2018). Parton distributions and lattice QCD calculations: A community white paper. Progress in Particle and Nuclear Physics, 100, 107–160.
• Manohar, A., Nason, P., Salam, G. P., & Zanderighi, G. (2016). How bright is the proton? A precise determination of the photon parton distribution function. Physical Review Letters, 117(24), 242002.
• Manohar, A. V., Nason, P., Salam, G. P., & Zanderighi, G. (2017). The photon content of the proton. Journal of High Energy Physics, 12, 046.
• Martin, A. D., Roberts, R. G., Stirling, W. J., & Thorne, R. S. (2005). Parton distributions incorporating QED contributions. European Physical Journal C, 39, 155.
• Martin, A. D., & Ryskin, M. G. (2014). The photon PDF of the proton. European Physical Journal C, 74, 3040.
• Nocera, E. R., Ball, R. D., Forte, S., Ridolfi, G., & Rojo, J. (2014). A first unbiased global determination of polarized PDFs and their uncertainties. Nuclear Physics B, 887, 276.
• Patrignani, C., Agashe, K., Aielli, G., Amsler, C., Antonelli, M., Asner, D. M., . . . Zyla, P. A. (2016). Review of particle physics. Chinese Physics C, 40(10), 100001.
• Paukkunen, H., & Zurita, P. (2014). PDF reweighting in the Hessian matrix approach. Journal of High Energy Physics, 12, 100.
• Politzer, H. D. (1973). Reliable perturbative results for strong interactions? Physical Review Letters, 30, 1346–1349.
• Pumplin, J., Stump, D., Brock, R., Casey, D., Huston, J., Kalk, J. . . . Tung, W. K. (2001). Uncertainties of predictions from parton distribution functions. II. The Hessian method. Physical Review D, 65, 014013.
• Rojo, J., Accardi, A., Ball, R. D., Cooper-Sarkar, A., de Roeck, A., Farry, S. . . . Thorne, R. (2015). The PDF4LHC report on PDFs and LHC data: Results from Run I and preparation for Run II. Journal of Physics G, 42, 103103.
• Sato, N., Andres, C., Ethier, J. J., & Melnitchouk, W. (2020). Strange quark suppression from a simultaneous Monte Carlo analysis of parton distributions and fragmentation functions. Physical Review D, 101, 074020.
• Schmidt, C., Pumplin, J., Stump, D., & Yuan, C. P. (2016). CT14QED parton distribution functions from isolated photon production in deep inelastic scattering. Physical Review D, 93(11), 114015.
• Vincent, A. C., Argüelles, C. A., & Kheirandish, A. (2017). High-energy neutrino attenuation in the Earth and its associated uncertainties. Journal of Cosmology and Astroparticle Physics, 1711(11), 012.
• Wang, B.-T., Hobbs, T. J., Doyle, S., Gao, J., Hou, T.-J., Nadolsky, P. M., & Olness, F. I. (2018). Mapping the sensitivity of hadronic experiments to nucleon structure. Physical Review D, 98(9), 094030.
• Wobisch, M., Britzger, D., Kluge, T., Rabbertz, K., & Stober, F. (2011). Theory–data comparisons for jet measurements in hadron-induced processes. arXiv.
• Zenaiev, O., Geiser, A., Lipka, K., Blümlein, J., Cooper-Sarkar, A., Garzelli, M. V. . . . Starovoitov, P. (2015). Impact of heavy-flavour production cross sections measured by the LHCb experiment on parton distribution functions at low x. European Physical Journal C, 75(8), 396.
• Zweig, G. (1964). An SU(3) model for strong interaction symmetry and its breaking: Version 2. In D. Lichtenberg & S. P. Rosen (Eds.), Developments in the quark theory of hadrons. Volume 1, 1964–1978 (pp. 22–101). Nonantum, MA: Hadronic Press.

### Notes

• 1. The same considerations hold for other hadrons. This article concentrates on the partonic structure of protons, for which far more experimental information is available than for other hadrons.

• 2. Similar factorized expressions apply for processes that feature hadrons in the final state of the collision.

• 3. Where one has defined $f_{\Sigma}=\sum_i\left(f_{q_i}+f_{\bar{q}_i}\right)$ and $f_{T_8}=f_{u}+f_{\bar{u}}+f_{d}+f_{\bar{d}}-2\left(f_{s}+f_{\bar{s}}\right)$.