Printed from Oxford Research Encyclopedias, Physics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 10 December 2022

# Philosophical Issues in Thermal Physics

• Wayne C. Myrvold, University of Western Ontario

### Summary

Thermodynamics gives rise to a number of conceptual issues that have been explored by both physicists and philosophers. One source of contention is the nature of thermodynamics itself. Is it what physicists these days would call a resource theory, that is, a theory about how agents with limited means of manipulating a physical system can exploit its physical properties to achieve desired ends, or is it a theory of the basic properties of matter, independent of considerations of manipulation and control? Another source of contention is the relation between thermodynamics and statistical mechanics. It has been recognized since the 1870s that the laws of thermodynamics, as originally conceived, cannot be strictly correct. Because of fluctuations at the molecular level, processes forbidden by the original version of the second law of thermodynamics are continually occurring. The original version of the second law is to be replaced with a probabilistic version, according to which large-scale violations of the original second law are not impossible but merely highly improbable, and small-scale violations are unpredictable and cannot be harnessed to systematically produce useful work. The introduction of probability talk raises the question of how we should conceive of probabilities in the context of deterministic physical laws.

### Subjects

• History of Physics
• Physics and Philosophy

### 1. Introduction

There are a number of conceptual issues raised by thermal physics. These are of two sorts. One has to do with the nature of thermodynamics itself; the other has to do with the relation between the concepts and laws of thermodynamics and those of statistical mechanics.

Thermodynamics relies, for its formulation, on a distinction between two modes of energy transfer between systems: as heat, and as work. It also relies on a distinction between thermodynamically reversible and thermodynamically irreversible processes. It is in terms of these two distinctions that thermodynamic entropy is defined. There are two ways to think about these distinctions. One way, stemming from the historical roots of the theory in the analysis of efficiency of heat engines, has it that thermodynamics is not a theory of basic or fundamental physics but, rather, a theory about how agents with limited knowledge about the state of a physical system and limited means of manipulating it can exploit its physical properties to obtain desired ends, such as obtaining useful work. That is, thermodynamics is seen as a resource theory, or family of resource theories. On this view, it is natural for the work/heat distinction and the reversible/irreversible distinction, and with them the notion of thermodynamic entropy, to be relative to a specified class of manipulations. Another view has it that thermodynamics has been liberated from its roots in technological considerations, and that thermodynamic quantities such as the entropy of a system can be regarded as intrinsic properties of physical systems.

There is also a cluster of questions about the relation of thermodynamics to mechanics. For reasons that will be reviewed in the next section, it is clear that the laws of thermodynamics, as originally conceived, cannot be derived from the underlying microphysics, be it classical or quantum, alone. This raises the question of what is to be added to statistical mechanics to obtain analogues of the laws of thermodynamics.

### 2. Historical Sketch

The term “thermodynamics” was coined by Kelvin for the emerging science of the interrelations between heat and mechanical work. The term is formed from the Greek words for heat and power, and flags a distinction that is central to the subject, a distinction between energy transfer as heat and as work. It has its roots in the experiments of Joule and others on the mechanical equivalent of heat, and in Carnot’s investigations into the maximum efficiency of heat engines (Carnot, 1824).

From these two sources are derived what Kelvin identified as the two fundamental principles, or laws, of thermodynamics. The first has to do with the existence of a mechanical equivalent of heat. It states that equal expenditures of mechanical work, converted entirely into heat, produce equal quantities of heat (measured calorimetrically), independent of temperature. This permits a quantity of heat transferred between two bodies to be measured in terms of its mechanical equivalent, in units of energy, and permits conception of the total internal energy of a system, which may be altered in two ways: by transfer of heat into or out of the system, or by work done on or by the system. The second law of thermodynamics states that heat cannot be transferred from a cooler to a warmer body without some expenditure of work or other change in another system that amounts to a loss of opportunity to extract useful work from the system.

In light of the first law of thermodynamics, it is natural to think of the internal energy of a body as realized in part by the kinetic energy of component parts too small to be perceived directly. Unsurprisingly, some of the pioneers of thermodynamics were also engaged in developing the kinetic theory of heat, on which an increase of temperature of a body is an increase in the kinetic energy per molecule. Conceiving of heat in this way brings the study of heat into the domain of mechanics (to 19th-century researchers this meant, of course, classical mechanics, but the chief conceptual issues remain if the underlying mechanics is taken to be quantum). This, in turn, raises the question of the relation between the concepts of thermodynamics and those of the underlying mechanics, and of the laws of thermodynamics to those of mechanics.

During the decade 1867–1877, it came to be realized by all of the major figures involved that thermodynamical phenomena, taken to include the existence of irreversible processes, could not be explained as a consequence of mechanics alone, and that other conceptual resources would have to come into play. One argument to this effect is the reversibility argument, which first appears in a penciled annotation on a letter from Maxwell to P. G. Tait of December 11, 1867 (Harman, 1990, p. 332; Knott, 1911, p. 214), and makes its way into the scientific literature in Thomson (1874). The argument is simple. If (as was generally assumed) the interactions between molecules are functions of intermolecular distances alone, then the Hamiltonian of a system consisting of such molecules is quadratic in any quantities that change sign under a reversal of velocities. It follows from this that the dynamical laws are invariant under velocity-reversal; for any dynamically possible trajectory through phase space, there is a corresponding reversed trajectory that is also dynamically possible, on which the paths of all the molecules are traced in reversed order. This means that temporally asymmetric phenomena, such as a tendency for heat to flow spontaneously from a warmer to a cooler body, or more generally a tendency of bodies to relax toward equilibrium states, cannot be a consequence of molecular dynamics alone.
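The velocity-reversal argument can be illustrated numerically. The following is a minimal sketch, not drawn from the historical sources: it assumes a hypothetical chain of three unit-mass particles on a line, with neighbours interacting through a potential that depends only on their separation, and it integrates the motion with the velocity Verlet scheme, which is itself time-reversible. The function names, the potential, and the initial conditions are all illustrative choices.

```python
def forces(q):
    """Forces from a pair potential V(r) = (r - 1)**2 / 2 between neighbours.

    The potential depends only on the separation r of neighbouring particles,
    as the reversibility argument assumes (the particles stay ordered here).
    """
    f = [0.0] * len(q)
    for i in range(len(q) - 1):
        r = q[i + 1] - q[i]
        # dV/dr = r - 1: particle i is pulled toward particle i + 1 when r > 1.
        f[i] += r - 1.0
        f[i + 1] -= r - 1.0
    return f


def evolve(q, p, steps, dt=0.01):
    """Velocity Verlet integration for unit masses; returns the new (q, p)."""
    q, p = list(q), list(p)
    f = forces(q)
    for _ in range(steps):
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
        q = [qi + dt * pi for qi, pi in zip(q, p)]
        f = forces(q)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
    return q, p


q0 = [0.0, 1.3, 2.1]
p0 = [0.4, -0.2, 0.1]

# Run forward, reverse all velocities, and run forward again for the same time.
q1, p1 = evolve(q0, p0, steps=2000)
q2, p2 = evolve(q1, [-pi for pi in p1], steps=2000)

# The system retraces its path: positions return to their initial values,
# and momenta return with their signs flipped.
position_error = max(abs(a - b) for a, b in zip(q2, q0))
momentum_error = max(abs(a + b) for a, b in zip(p2, p0))
```

Both errors are at the level of floating-point rounding, so the reversed trajectory is as dynamically admissible as the original one, which is the point of the argument.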

In Boltzmann’s work, reversibility considerations came to the fore in connection with the question of the meaning of his H-theorem. In 1872 Boltzmann had proven a theorem, which has come to be called the H-theorem, that gave the appearance of deriving, from mechanical considerations alone, a conclusion that the distribution of velocities in a gas always monotonically approaches a Maxwell-Boltzmann distribution (the generalization of the Maxwell distribution to a gas subjected to an external force, such as gravity), independently of the initial state of the gas. This is, of course, impossible, on reversibility grounds. It was via a paper by Loschmidt (1876) that Boltzmann’s attention was drawn to reversibility considerations, though criticism of the H-theorem was not Loschmidt’s chief aim in the paper (see Darrigol, 2018a, pp. 181–188 for discussion).

Another argument for a similar conclusion, advanced by Zermelo (1896), stems from the fact that there is a measure on phase space, the Liouville measure, that is invariant under Hamiltonian evolution. Consider a system constrained to remain within a region of phase space of finite measure (e.g., a system confined to a box). Take any function on phase space that is integrable with respect to the Liouville measure. The integral of this function with respect to the Liouville measure will be unchanged under dynamical evolution. This has the consequence that, if the function increases along some set of trajectories of nonzero measure, it must decrease along others. Therefore, if the entropy of a system is determined by its microstate, a monotonic increase of entropy cannot be a consequence of the laws of motion. As pointed out by Poincaré (1893) and Zermelo (1896), the same conclusion follows from the Poincaré recurrence theorem (first proved by Poincaré 1890; see, e.g., Berger, 2001 for a contemporary exposition). A system that is confined to a box and starts out in a non-equilibrium macrostate may evolve to an equilibrium macrostate in which it will stay for an enormously long time, but it cannot be expected to remain there forever. See Brown et al. (2009) for further discussion of these arguments.
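A finite toy model conveys the idea behind recurrence. The sketch below is an illustration under stated assumptions, not Poincaré's or Zermelo's argument: it replaces phase space with a hypothetical grid of `n * n` points and replaces Hamiltonian evolution with Arnold's cat map, which is invertible and so merely permutes the points. Because a permutation of a finite set cannot move a point forever without returning it, every state recurs.

```python
def cat_map(state, n):
    """Arnold's cat map on an n-by-n grid of points.

    The map is invertible (its matrix has determinant 1), so it permutes the
    grid; it serves here as a toy stand-in for measure-preserving evolution
    on a bounded phase space.
    """
    x, y = state
    return ((2 * x + y) % n, (x + y) % n)


def recurrence_time(state, n):
    """Number of steps until the trajectory first returns to its starting point."""
    current = cat_map(state, n)
    steps = 1
    while current != state:
        current = cat_map(current, n)
        steps += 1
    return steps


n = 101
# Collect the recurrence times of every point on the grid.
times = {recurrence_time((x, y), n) for x in range(n) for y in range(n)}
longest = max(times)
```

In this toy model the recurrence times are short. For a macroscopic system the corresponding times are enormously long, which is why recurrence is compatible with the observed approach to equilibrium.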

The lesson drawn from the reversibility argument, Zermelo’s measure-theoretic argument, and the Poincaré recurrence theorem was that the ubiquitous tendency of systems to approach thermodynamic equilibrium should not be thought of as an exceptionless law. Macroscopically observable exceptions are not physically impossible. This means that it is not impossible, but at most highly improbable, for a system that has relaxed to a state of macroscopic equilibrium to spontaneously evolve to a non-equilibrium state from which work can be extracted. This motivates a re-conception of the second law of thermodynamics: it should not be thought of as a law that holds strictly and without exceptions. However, although minute violations of the second law, as originally conceived, would be continually occurring at the microscopic level, macroscopically noticeable departures from the law may be expected to be so rare as to be negligible. Maxwell (1871, 1873, 1878) declared the second law to be a regularity of the same sort found in population-level statistics of human populations, a statistical regularity, resulting from aggregation of large numbers of individually unpredictable events. It is a law of this sort, qualified by statistical and probabilistic considerations, that physicists accept today as the second law of thermodynamics. Because what is claimed is that a system in a non-equilibrium state will probably move closer to equilibrium, this means that probabilistic considerations must be introduced. This raises the question of the meaning and justification of probabilistic statements in this context.

In his response (1877a) to the reversibility argument, Boltzmann acknowledged that his conclusion that a gas equilibrates must be regarded as only probable, and other sorts of behavior must be regarded as improbable but not impossible. He remarked, “One could even calculate, from the relative numbers of the different state distributions, their probabilities” (Boltzmann, 1966b, p. 192, from Boltzmann, 1877a, p. 71, in Boltzmann, 1909b, p. 121.) He followed up on this later that year, with a publication in which he demonstrated a relation between the quantity H and what he called the “measure of permutability” (Permutabilitätsmaß), and showed that, for an ideal gas, this quantity is, up to an additive constant, proportional to the entropy. He also remarked that it retained meaning during a non-reversible process (Boltzmann, 1877b, p. 428, in Boltzmann, 1909b, p. 217; translation in Sharp & Matschinsky, 2015, p. 2005. See also Boltzmann, 1896, §8). This provided Boltzmann with a new way of thinking about the approach to equilibrium, in terms of what Darrigol (2018a) has called the “Boltzmann principle,” which says that an isolated system always evolves from a less probable to a more probable state.

Boltzmann’s remarks about the relation between entropy and the measure of permutability were taken as a fundamental principle in the work of Planck (1901, p. 556). In his first uses of this principle, Planck followed Boltzmann in taking entropy to be defined only up to an additive constant. Later, Planck concluded that the phase space of a system actually consists of cells of minimum size, on the basis of which an absolute value of entropy could be defined (see Planck, 1913, §120). He concluded this on the basis of Nernst’s heat theorem, which, in Planck’s formulation (more general than that of Nernst), is:

“as the temperature diminishes indefinitely the entropy of a chemically homogeneous body of finite density approaches indefinitely near to a definite value, which is independent of its pressure, its state of aggregation, and of its special chemical modification” (Planck, 1922, §282).

Planck also introduced what is now called “Boltzmann’s constant,” $k$, producing the formula inscribed on Boltzmann’s tomb (Planck, 1913, p. 117; Planck, 1914, p. 119),

$$S = k \log W. \tag{1}$$

Though a formula equivalent to this occurs in Boltzmann’s Lectures on Gas Theory (I, §8; II, §61), Planck by this time conceived of it somewhat differently from Boltzmann. Boltzmann took it to define a quantity that, like thermodynamic entropy, is defined only up to an additive constant, freely discarding terms that depend only on N, the number of molecules, and did not use it to compare entropies of states of differing N. Planck, however, took the lesson of Nernst’s heat theorem to be that one can assign an absolute value to entropy. Once the formula (1) is taken to define an absolute entropy, rather than merely an expression that yields entropy differences between states of the same N, this raises the question of whether it gives the correct dependence on N.
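The relation between entropy and the counting of arrangements can be made concrete with a small calculation. The sketch below assumes the standard combinatorial count, $W = N!/(n_1! \, n_2! \cdots n_m!)$, for the number of ways of assigning $N$ labelled molecules to $m$ cells with given occupation numbers; the occupation numbers and the choice of units with $k = 1$ are hypothetical.

```python
from math import lgamma, log


def log_permutability(occupations):
    """log W for W = N! / (n_1! n_2! ... n_m!), the number of ways of assigning
    N labelled molecules to cells with the given occupation numbers."""
    total = sum(occupations)
    return lgamma(total + 1) - sum(lgamma(n + 1) for n in occupations)


def entropy(occupations, k=1.0):
    """S = k log W, in units where Boltzmann's constant defaults to 1."""
    return k * log_permutability(occupations)


uniform = [25, 25, 25, 25]      # molecules spread evenly over four cells
lopsided = [70, 20, 5, 5]       # same N = 100, unevenly distributed
concentrated = [100, 0, 0, 0]   # all molecules in one cell, so W = 1

s_uniform = entropy(uniform)
s_lopsided = entropy(lopsided)
s_concentrated = entropy(concentrated)
```

The even distribution has the largest count and the fully concentrated one the smallest. Note that the comparison is made at fixed $N$; the sketch says nothing about how the expression should depend on $N$.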

#### 2.1 Two Conceptions of Thermodynamics and of Thermodynamic States

From the early days of thermodynamics and the investigations into the implications for thermodynamics of the kinetic theory of heat, there has been a conception of thermodynamics according to which it is not a theory of pure physics but what would now be called a resource theory (or family of resource theories). A resource theory has to do with how agents with access to specified physical and informational resources and the ability to perform specified operations can use those resources to accomplish specified tasks. This view of thermodynamics as a resource theory was clearly articulated by Maxwell (see Myrvold, 2011 for details) and can be found in a number of authors throughout the history of thermodynamics; see Myrvold (2020) for a presentation and defense of this view, and for some of the relevant historical quotations. Until recently, it has been a minority view, but it has seen a resurgence within quantum thermodynamics (Goold et al., 2016; Gour et al., 2015; Lostaglio, 2019; Ng & Woods, 2018). A contrasting view has it that, though thermodynamics has its roots in investigations of the ways in which physical systems can be manipulated to perform useful work, in its mature form it can be severed from these roots. A view of this sort was perhaps first clearly articulated by Planck in a series of lectures on theoretical physics delivered in 1909 (Planck, 1910, first and third lectures). These two conceptions of thermodynamics will be referred to as Maxwellian and Planckian views, respectively.

The two views differ on the nature of the thermodynamic state. The thermodynamic state of a system is defined by its total internal energy $E$ (alternatively, temperature $T$), and one or more variables $X_1, X_2, \ldots, X_n$. These may include the volume of a container the system is confined to, as well as external applied fields. There are two related, but conceptually distinct, conceptions of the variables $X_1, X_2, \ldots, X_n$ that are used to define a thermodynamic state. The standard conception is that they are the macroscopic variables; which variables these are is determined by the resolution of macroscopic measurement apparatus. On the resource-theoretic view, these are the variables that can be manipulated to do work on the system.

To illustrate the difference between these two conceptions, consider the case of interdiffusion of gases, which was discussed by Gibbs (1875, pp. 228–229, 1906, pp. 166–167), and which has been the topic of considerable discussion since that time. A container is divided by a partition into two subvolumes, each containing samples of gas at the same pressure, and maintained at the same temperature via contact with a heat bath. The partition is removed, and the gases interdiffuse, until each is equally distributed within the whole volume. Has there been an increase of entropy, or not?

The now-standard answer, given by Gibbs, is that if the gases initially in the two subvolumes are of the same type, there has been no change of thermodynamic state, and ipso facto no change in entropy. If the two subvolumes initially contain gases of different types, initial and final states of the contents of the container are distinct thermodynamic states, and the entropy of the final state is higher than that of the initial state. This entropy increase is known as the entropy of mixing.

The two conceptions of thermodynamic state yield different criteria for distinctness of the two gases. On the Planckian view, a thermodynamic state is defined with respect to macroscopically measurable variables, and the two gases are distinct if and only if they can be distinguished via a macroscopic measurement. On the Maxwellian view, the two gases count as distinct if and only if they can be differentially manipulated, in such a way that the expansion of the gases could be exploited to obtain work.

If there is a reversible process by which the gases, having been mixed, can be separated, with expenditure of work, then the interdiffusion represents a lost opportunity to extract work from the system, as the separation process could be run in reverse, mixing the two gases while obtaining work. A standard thought experiment used to calculate the entropy of mixing involves consideration of pistons permeable to one of the gases and not to the other. (This thought experiment, in its familiar form, is found in Planck, 1897, §236. It is based on earlier use of the device of semipermeable pistons by Boltzmann, 1878.) These pistons could be used to expand the gases while in contact with a heat bath that maintains a constant temperature; the heat that flows into the gases from the heat bath, divided by the temperature, yields the entropy increase.
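As a worked illustration, suppose (an assumption not made explicit above) that both gases are ideal, with $N_1$ and $N_2$ molecules initially occupying subvolumes $V_1$ and $V_2$ of the total volume $V = V_1 + V_2$. Each gas expands isothermally from its subvolume to the whole container, absorbing from the heat bath an amount of heat equal to the work it does on its piston:

$$Q_i = N_i k T \log \frac{V}{V_i}, \qquad \Delta S = \frac{Q_1 + Q_2}{T} = k \left( N_1 \log \frac{V}{V_1} + N_2 \log \frac{V}{V_2} \right).$$

For equal subvolumes containing equal numbers of molecules, $N_1 = N_2 = N$ and $V_1 = V_2 = V/2$, this reduces to $\Delta S = 2 N k \log 2$.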

On the Maxwellian, resource-theoretic view of thermodynamics, a process of this sort, in which molecules of the two gases are differentially manipulated, must be available in order for there to be an entropy difference between initial and final states. An observable difference between the two gases, if it could not be exploited for the purposes of obtaining work, would not suffice. Thus, on this view, two states that are macroscopically distinguishable might nevertheless be counted as not thermodynamically distinct.

### 3. Entropies and Their Relations

#### 3.1 Thermodynamic Entropy

The definition of thermodynamic entropy should be familiar to most readers. It was defined on the basis of the second law by Clausius (1854), who at that time called it the equivalence-value (Aequivalenzwerth) of the heat transferred in a reversible process. In 1865 he coined the term Entropie, from the Greek ἡ τροπὴ‎, transformation.

The version of the second law of thermodynamics relevant to defining entropy is the Clausius formulation, that it is impossible for a system operating in a cycle to have no other net effect than to transport heat from a cooler to a warmer body. It follows from this that no heat engine operating between two reservoirs can be more efficient than a reversible engine, and that all reversible engines have the same efficiency (Carnot’s theorem). And from this it follows that, if a system undergoes a process that leaves it in the same thermodynamic state it started in, exchanging heats $Q_i$ with reservoirs at temperatures $T_i$ (counting heat entering the system as positive and leaving it as negative),

$$\sum_i \frac{Q_i}{T_i} \leq 0, \tag{2}$$

with equality only if the process is reversible. This implies that, if two thermodynamic states $a$ and $b$ of a system can be connected reversibly, the quantity

$$\int_a^b \frac{\delta Q}{T} \tag{3}$$

must be the same for any reversible process connecting $a$ to $b$. This permits a state function $S$ to be defined, which is unique up to an additive constant, such that

$$S(b) - S(a) = \int_a^b \frac{\delta Q}{T} \tag{4}$$

for any reversible process that takes a to b.
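The way in which the reversible limit fixes the entropy difference can be checked numerically. The sketch below assumes a hypothetical body of constant heat capacity (set to 1 in arbitrary units) that is warmed from one temperature to another by contact with a sequence of equally spaced reservoirs. Closing the cycle with a reversible return path shows that the Clausius inequality requires the sum of $Q_i/T_i$ over the heating process to be no greater than $S(b) - S(a)$, with equality approached as the temperature steps shrink.

```python
from math import log


def heat_over_temperature(t_start, t_end, n_reservoirs, heat_capacity=1.0):
    """Sum of Q_i / T_i when a body of constant heat capacity is warmed from
    t_start to t_end by contact with n equally spaced reservoirs in turn."""
    step = (t_end - t_start) / n_reservoirs
    total = 0.0
    for i in range(1, n_reservoirs + 1):
        t_reservoir = t_start + i * step
        q = heat_capacity * step  # heat absorbed in equilibrating with reservoir i
        total += q / t_reservoir
    return total


def entropy_change(t_start, t_end, heat_capacity=1.0):
    """S(b) - S(a) for reversible heating: the integral of C dT / T."""
    return heat_capacity * log(t_end / t_start)


delta_s = entropy_change(300.0, 400.0)
coarse = heat_over_temperature(300.0, 400.0, n_reservoirs=1)
fine = heat_over_temperature(300.0, 400.0, n_reservoirs=10_000)
```

With a single reservoir the sum falls well short of the entropy change; with ten thousand reservoirs the two nearly coincide, while the sum still does not exceed the entropy change.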

#### 3.2 Statistical-Mechanical Entropies

There are several statistical mechanical quantities that have been called “entropies” and that bear some relation to thermodynamical entropy.

It has become common in the literature on the foundations of statistical mechanics to say that there are two approaches to statistical mechanics, referred to as Boltzmannian and Gibbsian. This terminology is misleading, because both have their roots in Boltzmann’s work. In the philosophical literature these approaches are often treated as rivals. A common view in the philosophical literature is that “Boltzmannian” techniques are foundational, and that the role of “Gibbsian” techniques is to render certain calculations tractable. In textbooks of statistical mechanics, on the other hand, it is rare to see a clean distinction between the approaches, and techniques that are associated with the two approaches are freely intermingled. This is in keeping with the attitude of the founders of the subject; Boltzmann thought of the various techniques he introduced as complementary and was interested in demonstrating connections between them (see Darrigol, 2018a). Gibbs himself presented his work as building on and extending that of Boltzmann.

Associated with these approaches are quantities that have come to be called Boltzmann entropy and Gibbs entropy, the quantum-mechanical version of which is von Neumann entropy. The Gibbs entropy will be presented first, as the relation between Boltzmann entropy and thermodynamic entropy is most clearly demonstrated with reference to Gibbs and von Neumann entropies.

##### 3.2.1 Gibbs Entropies

It follows from the reversibility argument that, if an explanation is to be given in microphysical terms of the tendency of systems to equilibrate, this cannot follow from the microphysics alone, and the expected behavior cannot hold for all microstates. Moreover, temporary movement away from an equilibrium macrostate should not be regarded as impossible, but at best improbable. This suggests that probabilistic considerations are to be brought into play. In this spirit, the “Gibbsian” framework examines the evolution of probability distributions on the state spaces of the systems considered.

Boltzmann (1871, 1884, 1898) found it helpful to consider the evolution of a hypothetical collection of systems all having the same values of energy and other macrovariables but differing in their microstates, as did Maxwell (1879) and Gibbs (1902), the latter of whom referred to a collection of this sort as an ensemble. This procedure has given rise to the misconception that results thereby obtained are not about individual systems, but about these hypothetical ensembles. This is incorrect. If one is interested in the behavior of an individual system, but the concern is not with its behavior given some specific initial condition, but under a range of initial conditions, it can be useful to examine the behavior of a collection of trajectories, not merely a single one. Moreover, if probabilistic considerations are to be brought into play, an ensemble of this sort can be a useful way of visualizing a probability distribution on the system’s state space.

Gibbs himself alternated between talk of ensembles and talk of distributions. This is in accord with the then-prevalent frequentism about probabilities. Nothing in the method, however, is wedded to a frequentist interpretation of probability. In what follows, distributions will be spoken of, to emphasize that probabilistic considerations are being brought to bear, without commitment to a frequentist interpretation of probability.

First, there are a few considerations relevant to probability distributions on classical phase space. For a system of $n$ degrees of freedom, the $2n$-dimensional phase space of the system can be parameterized by a set of parameters $\{q_1, q_2, \ldots, q_n, p_1, p_2, \ldots, p_n\}$ consisting of $n$ coordinates and their conjugate momenta. One defines, in the usual way, the Lebesgue measure on the parameter space. This measure assigns to any “rectangle” with sides of lengths

$$\Delta q_1, \ldots, \Delta q_n, \Delta p_1, \ldots, \Delta p_n$$

a measure equal to the product of those lengths. The measure is extended to more complicated sets by invoking the usual assumption of countable additivity, that is, the condition that, for any sequence $\{A_n\}$ of disjoint sets, the measure assigned to the union of all these sets is the sum of the measures of all the sets in the sequence. Details of the construction of the Lebesgue measure can be found in many textbooks of probability; see, for example, Billingsley (2012). It can be shown that the measure on phase space induced by the Lebesgue measure on the space of parameters is the same for any set of canonical coordinates; this is the Liouville measure, also called the phase-space volume.

This measure has two features that make it of particular interest: it is invariant under canonical transformations of coordinates, and it is conserved under Hamiltonian evolution, whether or not the Hamiltonian is time independent.
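Conservation of phase-space volume can be checked numerically for a time-dependent Hamiltonian. The sketch below rests on several assumptions: the system is a hypothetical driven pendulum, the exact flow is replaced by a leapfrog integrator (each of whose sub-steps is a shear of the phase plane and therefore area-preserving), and the Jacobian determinant of the resulting map is estimated by finite differences. A determinant of 1 means that small phase-space areas are carried into equal areas.

```python
from math import sin, cos


def flow(q, p, steps=500, dt=0.01):
    """Leapfrog integration of a driven pendulum,
    H(q, p, t) = p**2 / 2 - cos(q) - 0.5 * q * cos(2 * t).
    Each sub-step is a shear of the (q, p) plane, so the composite map
    preserves area, as the exact Hamiltonian flow does."""
    t = 0.0
    for _ in range(steps):
        p += 0.5 * dt * (-sin(q) + 0.5 * cos(2 * t))
        q += dt * p
        t += dt
        p += 0.5 * dt * (-sin(q) + 0.5 * cos(2 * t))
    return q, p


def jacobian_determinant(q, p, h=1e-5):
    """Central-difference estimate of det d(q_t, p_t) / d(q_0, p_0)."""
    q_plus, p_plus = flow(q + h, p)
    q_minus, p_minus = flow(q - h, p)
    dq_dq, dp_dq = (q_plus - q_minus) / (2 * h), (p_plus - p_minus) / (2 * h)
    q_plus, p_plus = flow(q, p + h)
    q_minus, p_minus = flow(q, p - h)
    dq_dp, dp_dp = (q_plus - q_minus) / (2 * h), (p_plus - p_minus) / (2 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq


det = jacobian_determinant(1.0, 0.3)
```

The estimated determinant agrees with 1 to within the accuracy of the finite-difference scheme, even though the Hamiltonian depends explicitly on time.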

Consider a physical system with state space $\Gamma$, and a dynamical evolution on the state space represented by a family $\{T_t\}$ of mappings $T_t: \Gamma \to \Gamma$ for all $t$ in some interval. These mappings are usually assumed to be invertible, but noninvertible dynamics can also be considered. The dynamical evolution of states of the system induces an evolution of probability distributions. Let $P_0$ be a probability distribution over the state of the system at some time, which is taken to be $t = 0$. Then the probability that, at time $t$, the system is in a subset $\Delta$ of the state space is given by

$$P_t(\Delta) = P_0(T_t^{-1}(\Delta)), \tag{5}$$

where $T_t^{-1}(\Delta)$ is the set of states that $T_t$ maps into $\Delta$.

The evolution $T_t$ is said to conserve probability $P_0$ if the probability assigned to the evolute $T_t(\Delta)$ of any measurable set $\Delta$ is the same as the probability assigned to $\Delta$; that is, if, for all $t$,

$$P_0(T_t(\Delta)) = P_0(\Delta). \tag{6}$$

The probability distribution $P_0$ is invariant, or stationary, under the evolution if $P_t$ is the same for all $t$; that is, if, for all $t$,

$$P_0(T_t^{-1}(\Delta)) = P_0(\Delta). \tag{7}$$

Any conserved probability measure is also invariant. If the dynamical map is invertible, any invariant measure is conserved, but, if it is not invertible, there could be an invariant measure that is not conserved.
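A standard example, not taken from the discussion above, shows how the two notions come apart for noninvertible dynamics. Take the doubling map $T(x) = 2x \bmod 1$ on the interval $[0, 1)$, and let $P_0$ be the uniform distribution. For an interval $\Delta = [a, b)$ contained in $[0, 1/2)$, the set of points mapped into $\Delta$ consists of two intervals, each of half the length of $\Delta$, whereas the image of $\Delta$ is an interval of twice its length:

$$P_0(T^{-1}(\Delta)) = P_0\!\left(\left[\tfrac{a}{2}, \tfrac{b}{2}\right)\right) + P_0\!\left(\left[\tfrac{a+1}{2}, \tfrac{b+1}{2}\right)\right) = b - a = P_0(\Delta), \qquad P_0(T(\Delta)) = P_0([2a, 2b)) = 2(b - a).$$

The uniform distribution is therefore invariant under the doubling map but is not conserved by it.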

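If a distribution $P_t$ on a classical phase space has density $\rho_t$ with respect to the Liouville measure, the evolution of this density under Hamiltonian evolution satisfies Liouville’s equation,

$$\frac{\partial \rho_t}{\partial t} + \sum_{i=1}^{n} \left( \frac{\partial \rho_t}{\partial q_i} \frac{\partial H}{\partial p_i} - \frac{\partial \rho_t}{\partial p_i} \frac{\partial H}{\partial q_i} \right) = 0. \tag{8}$$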

Any distribution represented by a density function that is a function only of conserved quantities is an invariant distribution. In particular, if the Hamiltonian is time-independent, any distribution represented by a density function that is a function only of the Hamiltonian is invariant.

Three probability distributions are of particular interest. The first, deemed appropriate for an isolated system of known energy $E$, is the microcanonical distribution. This is the restriction of the Liouville measure to a narrow shell consisting of all states with energy in the interval $[E, E + \delta E]$. The second is the canonical distribution, appropriate for a system in thermal equilibrium with a heat bath. A canonical distribution has a density, with respect to the Liouville measure, given by

$$\rho_\beta(x) = \frac{e^{-\beta H(x)}}{Z_\beta}, \tag{9}$$

where $x$ ranges over the points of phase space, $\beta = 1/kT$, and $Z_\beta$ is the normalization constant required to make the integral of this function equal to unity. Considered as a function of $\beta$ and any external parameters on which the Hamiltonian may depend, $Z_\beta$ is called the partition function. In order for this to be well-defined, it is necessary for there to be restrictions (such as walls restricting the system to finite volume) in place that ensure convergence of this integral.
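A simple case in which the partition function can be computed both numerically and in closed form is a one-dimensional harmonic oscillator. The sketch below is illustrative only: the oscillator, the unit choices ($k = 1$, unit mass and frequency by default), and the integration grid are all assumptions. It evaluates the normalization integral with respect to the Liouville measure $dq \, dp$ and recovers the mean energy from the logarithmic derivative of the partition function.

```python
from math import exp, log, pi


def partition_function(beta, omega=1.0, mass=1.0, half_width=12.0, n=600):
    """Midpoint-rule estimate of Z = integral of exp(-beta * H) dq dp for a
    one-dimensional harmonic oscillator,
    H = p**2 / (2 m) + m * omega**2 * q**2 / 2."""
    h = 2.0 * half_width / n
    grid = [-half_width + (i + 0.5) * h for i in range(n)]
    # The Boltzmann factor separates into a q-integral times a p-integral.
    z_q = sum(exp(-beta * 0.5 * mass * omega**2 * q * q) for q in grid) * h
    z_p = sum(exp(-beta * p * p / (2.0 * mass)) for p in grid) * h
    return z_q * z_p


def mean_energy(beta, d_beta=1e-4):
    """<H> = -d(log Z)/d(beta), estimated by a central difference."""
    upper = log(partition_function(beta + d_beta))
    lower = log(partition_function(beta - d_beta))
    return -(upper - lower) / (2.0 * d_beta)


z_numeric = partition_function(1.0)
z_exact = 2.0 * pi / (1.0 * 1.0)   # closed form: 2 * pi / (beta * omega)
energy = mean_energy(1.0)          # equipartition gives 1 / beta
```

The integral converges here because the quadratic potential confines the oscillator, playing the role that walls play for a gas.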

These distributions apply to systems of a fixed, finite number of degrees of freedom. Systems that do not have a fixed number of components can also be considered, for example, a system that can exchange molecules with its environment. A grand canonical distribution for a system containing components of $r$ different types is a mixture of canonical distributions, one for each possible specification of the numbers $(n_1, \ldots, n_r)$ of components of each type.

Gibbs proposed statistical-mechanical analogues of thermodynamic entropy for each of these distributions. For the microcanonical distribution, his first proposal is

$$S_{\mathrm{vol}}(E) = k \log \Omega(E), \tag{10}$$

where $\Omega(E)$ is the phase-space volume of the set of states satisfying relevant external constraints and having energy less than or equal to $E$ (Gibbs, 1902, p. 170). A few pages later, he considers another analogue, namely,

$$S_{\mathrm{surf}}(E) = k \log \omega(E), \tag{11}$$

where $\omega(E)$ is what is called the structure function,

$$\omega(E) = \frac{d\Omega(E)}{dE}. \tag{12}$$

An energy shell of small width $\delta E$ containing the surface of energy $E$ has phase-space volume approximately equal to $\omega(E) \, \delta E$.

There has been extensive discussion of whether the quantity (10), called the volume entropy, or the quantity (11), the surface entropy, is a more appropriate analogue of thermodynamic entropy for an isolated system. See Abraham and Penrose (2017), Hilbert et al. (2014), Lavis (2019), and references therein.
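A worked example indicates what is at stake. Take the volume entropy to be $k \log \Omega(E)$ and the surface entropy to be $k \log \omega(E)$, with $\omega = d\Omega/dE$, and consider a hypothetical system of $N$ one-dimensional harmonic oscillators of angular frequency $\nu$, for which the phase-space volume enclosed by the energy surface is

$$\Omega(E) = \frac{1}{N!} \left( \frac{2\pi}{\nu} \right)^{N} E^{N}, \qquad \omega(E) = \frac{N}{E} \, \Omega(E).$$

If temperature is identified through $1/T = dS/dE$, the two candidates give

$$\frac{d}{dE} \, k \log \Omega(E) = \frac{N k}{E} \;\Rightarrow\; E = N k T, \qquad \frac{d}{dE} \, k \log \omega(E) = \frac{(N - 1) k}{E} \;\Rightarrow\; E = (N - 1) k T.$$

The two differ by a relative amount of order $1/N$, which is negligible for macroscopic systems but not for systems with few degrees of freedom.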

Now consider the analogue of entropy associated with canonical distributions. Recall that the definition of thermodynamic entropy requires initial and final states to be connected by a reversible process. Two canonical distributions are imagined to be connected via a gradual process, at each stage of which the system is in contact with a heat reservoir (which may be different reservoirs, at different temperatures, at different stages), and which proceeds sufficiently slowly that it can be treated as if it is in thermal equilibrium at all times. The Hamiltonian of the system is supposed to depend on parameters X = {X1, X2, . . ., Xn}, which can be manipulated to do work on, or extract work from, the system.

Let $ρ1$ and $ρ2$ be density functions for canonical distributions that differ only slightly, corresponding to parameters $Xβ$ and $X+dXβ+dβ$, respectively. Then, to first order in the parameter differences,

$Display mathematics$(13)

The first term on the right-hand side of this equation is the expectation value of the work done in changing the external parameters; the second term is the expectation value of the heat obtained from the reservoir,

$Display mathematics$(14)

This yields

$Display mathematics$(15)

This means that, if the parameters of the system are varied very slowly, relative to the equilibration timescale of the system, and the system is placed in thermal contact with a series of heat reservoirs with very small temperature differences between successive reservoirs,

$Display mathematics$(16)

The approximation can be made arbitrarily close by making the process sufficiently slow; the limiting value approached is called the quasi-static limit.

Define

$S_G[\rho] = -k \int \rho \log \rho \, d\Gamma$(17)

Then, in the quasi-static limit, (16) becomes,

$Display mathematics$(18)

The quantity $S_G$ is what is usually called the Gibbs entropy. For a pair of canonical distributions, the Gibbs entropy plays a role analogous to thermodynamic entropy, with actual heat exchanges replaced by expectation values of heat exchanges.
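The relation between quasi-static heat exchange and Gibbs entropy can be checked numerically on a toy case. The following is a minimal sketch under stated assumptions, not the article's own calculation: the system is a hypothetical two-level system with discrete energies (so sums replace phase-space integrals), units are chosen with k = 1, and the function names and the particular path through parameter space are illustrative. It accumulates the expected heat divided by temperature along a slow path of canonical distributions and compares the result with the change in the discrete analogue of the Gibbs entropy.

```python
import math

def canonical(energies, beta):
    """Canonical probabilities p_i = exp(-beta * E_i) / Z for discrete levels."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def gibbs_entropy(probs):
    """Discrete analogue of the Gibbs entropy with k = 1: -sum_i p_i log p_i."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def levels(x):
    """Hypothetical two-level system; the gap x plays the role of an external parameter."""
    return [0.0, x]

def path(t):
    """Illustrative slow path, t in [0, 1]: gap 1 -> 2, inverse temperature 1 -> 0.25."""
    return 1.0 + t, 1.0 - 0.75 * t

def heat_over_temperature(steps=20000):
    """Sum <dQ>/T along the path, with <dQ> = sum_i E_i dp_i and 1/T = beta (k = 1)."""
    total = 0.0
    x_old, beta_old = path(0.0)
    p_old = canonical(levels(x_old), beta_old)
    for n in range(1, steps + 1):
        x_new, beta_new = path(n / steps)
        p_new = canonical(levels(x_new), beta_new)
        # Midpoint values of the energy levels and of beta over this small step.
        e_mid = levels(0.5 * (x_old + x_new))
        beta_mid = 0.5 * (beta_old + beta_new)
        d_q = sum(e * (pn - po) for e, pn, po in zip(e_mid, p_new, p_old))
        total += beta_mid * d_q
        x_old, beta_old, p_old = x_new, beta_new, p_new
    return total

entropy_change = (gibbs_entropy(canonical(levels(2.0), 0.25))
                  - gibbs_entropy(canonical(levels(1.0), 1.0)))
heat_integral = heat_over_temperature()
print(entropy_change, heat_integral)  # the two numbers should agree closely
```

The part of the change in expected energy that comes from shifting the levels themselves (the expected work) is deliberately left out of the sum; only the part due to the changing probabilities is counted as heat.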

The relation (18) concerns a pair of canonical distributions. For other distributions, an inequality relating $S_G[\rho]$ to expectation values of heat exchanges can be derived. Suppose that a system A, which has at time $t_0$ a probability distribution $\rho_A(t_0)$, interacts successively with heat reservoirs $B_i$, whose distributions at $t_0$ are canonical distributions, with temperatures $T_i$, uncorrelated with A. Between times $t_0$ and $t_1$, the composite system consisting of A and the reservoirs undergoes Hamiltonian evolution, with a Hamiltonian that may be time-varying. Suppose, further, that at $t_0$ and $t_1$, the system A is not interacting with any of the reservoirs, and that during the evolution the reservoirs interact only with A. Under these assumptions, the expectation value of the energy exchanged between A and $B_i$ is

$Display mathematics$(19)

It can be shown (Gibbs, 1902, pp. 160–164) that, under the conditions outlined,

$\sum_i \frac{\langle Q_i \rangle}{T_i} \le S_G[\rho_A(t_1)] - S_G[\rho_A(t_0)]$(20)

It follows from this that, for a cyclic process, in which the marginal distribution of A at $t_1$ is the same as at $t_0$,

$\sum_i \frac{\langle Q_i \rangle}{T_i} \le 0$(21)

This is, of course, analogous to the second law of thermodynamics, with expectation values for heat exchanges replacing actual values.

Equations (18) and (20) are analogues of thermodynamic relations, with expectation values of energy changes replacing actual values and the Gibbs entropy replacing thermodynamic entropy. If the probability distributions for the quantities involved are tightly focused around their expectation values, then, with high probability, the actual value will be near the expectation value, and it might for certain purposes be justifiable to disregard the differences between expectation values and actual values. This will be the case if the system consists of a large number of components, and the kinetic energies of the components are large compared to the interaction energies. For such a system, on a canonical distribution, the energy will be the sum of a large number of approximately independent variables, and the weak law of large numbers, which says that, with high probability, the actual value will be close to the expectation value, applies. See Gibbs (1902, p. 168); see also Werndl and Frigg (2020) for examination of conditions under which approximate equality obtains.
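The narrowing of the energy distribution with system size can be illustrated with a short calculation. The sketch below assumes a monatomic ideal gas on a canonical distribution, for which each molecule's kinetic energy is an independent Gamma-distributed variable with shape 3/2 and scale kT; the function names, the unit choice kT = 1, and the sample sizes are illustrative. Under that assumption the ratio of the standard deviation of the total energy to its expectation value is the square root of 2/(3N), which the sampled estimate should approach.

```python
import math
import random

def predicted_relative_spread(n_particles):
    """sqrt(Var E) / <E> for n independent kinetic energies, each Gamma(3/2, kT)."""
    return math.sqrt(2.0 / (3.0 * n_particles))

def sampled_relative_spread(n_particles, trials, seed=0):
    """Estimate the same ratio by sampling total energies (units with kT = 1)."""
    rng = random.Random(seed)
    totals = [sum(rng.gammavariate(1.5, 1.0) for _ in range(n_particles))
              for _ in range(trials)]
    mean = sum(totals) / trials
    variance = sum((e - mean) ** 2 for e in totals) / (trials - 1)
    return math.sqrt(variance) / mean

for n in (4, 400):
    print(n, predicted_relative_spread(n), sampled_relative_spread(n, trials=500))
```

For a macroscopic number of molecules the predicted ratio is far too small to sample directly, which is the sense in which actual and expected energies can be treated as interchangeable for such systems.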

These relations provide useful information even when the variances of the energy exchanges for the probability distributions considered are large. They still show that it is not possible to consistently and reliably obtain more work from a cycle of an engine than permitted by the second law. Applied to an engine operating in a cycle involving a hotter reservoir at temperature $T_1$ and a colder reservoir at temperature $T_2$, equation (20) shows that the expectation value of work obtained in a single cycle satisfies,

$\langle W \rangle \le \left(1 - \frac{T_2}{T_1}\right) \langle Q \rangle$(22)

where ⟨Q⟩ is the expectation value of heat obtained from the hotter reservoir. Thus, the Carnot bound on efficiencies of heat engines becomes a bound on the expectation value of work obtained.
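The step from the cyclic inequality to the bound on expected work can be spelled out as follows. This is a sketch in assumed notation: $\langle Q' \rangle$, which does not appear in the text, denotes the expectation value of the heat delivered to the colder reservoir, so that the heat obtained from that reservoir is $-\langle Q' \rangle$.

```latex
\begin{align*}
\frac{\langle Q \rangle}{T_1} - \frac{\langle Q' \rangle}{T_2} &\le 0
  && \text{(cyclic process: the marginal distribution of the engine is restored)} \\
\langle W \rangle &= \langle Q \rangle - \langle Q' \rangle
  && \text{(expected energy balance over one cycle)} \\
\langle W \rangle &\le \left(1 - \frac{T_2}{T_1}\right) \langle Q \rangle
  && \text{(using } \langle Q' \rangle \ge \tfrac{T_2}{T_1}\langle Q \rangle \text{ from the first line)}
\end{align*}
```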

If the process connecting the states of A at $t_0$ and $t_1$ is reversible (that is, if it is possible for there to be another process that interchanges initial and final marginal distributions for A and reverses the signs of the quantities $\langle Q_i \rangle$), then equality in equation (20) is obtained. The Gibbs entropy is, therefore, an analogue of thermodynamic entropy for any two probability distributions, whether or not they are canonical distributions, that can be connected by a process that is reversible in this sense. It would be a mistake to treat the difference in Gibbs entropy as an analogue of thermodynamic entropy for probability distributions that cannot be connected reversibly. See Myrvold (2020, §6) for further discussion of the relation between $S_G$ and thermodynamic entropy.

##### 3.2.2 Von Neumann Entropy

The von Neumann entropy of a quantum state that is represented by a density operator $\hat{\rho}$ is defined as

$S_{vN}[\hat{\rho}] = -k \, \mathrm{Tr}(\hat{\rho} \log \hat{\rho})$(23)

Everything that was said in the previous section about the Gibbs entropy can be said about the von Neumann entropy. In particular, the quantum analogues of equations (18) and (20) hold.
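The parallel with the Gibbs entropy is easiest to see in the smallest case. The sketch below assumes a single two-level quantum system (a qubit), units with k = 1, and illustrative function names. For such a system the eigenvalues of the density operator are fixed by the length r of its Bloch vector, and for a canonical state, which is diagonal in the energy basis, the von Neumann entropy reduces to the discrete Gibbs form applied to the Boltzmann weights.

```python
import math

def qubit_von_neumann_entropy(bloch_length):
    """-Tr(rho log rho), k = 1, for a qubit whose Bloch vector has the given length.

    The eigenvalues of such a density operator are (1 + r)/2 and (1 - r)/2.
    """
    if not 0.0 <= bloch_length <= 1.0:
        raise ValueError("Bloch vector length must lie in [0, 1]")
    eigenvalues = [(1.0 + bloch_length) / 2.0, (1.0 - bloch_length) / 2.0]
    return -sum(p * math.log(p) for p in eigenvalues if p > 0.0)

def thermal_bloch_length(beta, gap):
    """Bloch length of the canonical state of a two-level system with energy gap `gap`."""
    return math.tanh(0.5 * beta * gap)

print(qubit_von_neumann_entropy(0.0))   # maximally mixed state
print(qubit_von_neumann_entropy(1.0))   # pure state
print(qubit_von_neumann_entropy(thermal_bloch_length(1.0, 1.0)))
```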

##### 3.2.3 Boltzmann Entropy

In this section, the procedure of Boltzmann (1877b, 1896), summarized in Ehrenfest and Ehrenfest (1912), is followed. For simplicity, a system that consists of a large number N of identical molecules, each with r degrees of freedom, is considered. The generalization to systems consisting of several types of molecules is straightforward. Let $\mu$ be the 2r-dimensional phase space of an individual molecule and let $\Gamma = \mu^N$ be the 2rN-dimensional phase space of the entire system of N molecules.

Partition $\mu$ into small regions $\omega_i$ of equal Liouville measure $\omega$, corresponding to small intervals of values of each of the coordinates and momenta. This induces a coarse graining of $\Gamma$ into cells of volume $\omega^N$. A complexion is an assignment of each molecule to the phase-space cell in which its phase point lies. A specification, for each $\omega_i$, of the number $n_i$ of molecules whose state lies in that region is called a state distribution.

For each state distribution Z there is a corresponding subset $\Gamma_Z$ of $\Gamma$, consisting of phase points that yield the state distribution Z (such a region is called, by the Ehrenfests [Ehrenfest & Ehrenfest, 1959/1912], a “Z-star”). The Z-star corresponding to a state distribution $Z = \{n_i\}$ is a union of elementary cells, each with Liouville measure $\omega^N$. The number of such cells is equal to

$P = \frac{N!}{\prod_i n_i!}$(24)

This is the permutability of the state distribution, equal to the number of complexions compatible with it. Consider the logarithm of this quantity,

$\log P = \log N! - \sum_i \log n_i!$(25)

If the sum is dominated by terms involving numbers large enough that Stirling’s approximation formula,

$\log n! \approx n \log n - n$(26)

is valid, then

$\log P \approx N \log N - \sum_i n_i \log n_i$(27)

As he was concerned only with comparing states in which the total number of molecules, N, is constant, Boltzmann discarded the terms that, for fixed N, do not vary between state distributions, and considered the quantity,

$H = \sum_i n_i \log n_i$(28)

Boltzmann remarked, in passing, that working with this quantity, rather than the logarithm of the permutability $P$, renders it an extensive quantity; its value for a composite system consisting of two disjoint subsystems is the sum of its values for the component subsystems (Boltzmann, 1877b, in Boltzmann, 1909b, p. 192; Sharp & Matschinsky, 2015, p. 1990).
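The counting argument can be checked numerically. The sketch below is illustrative only: it assumes the permutability is the multinomial coefficient N! divided by the product of the n_i!, takes H to be the sum of n_i log n_i, and uses hypothetical occupation numbers for four cells. It compares the exact logarithm of the permutability with the Stirling estimate N log N minus H, and confirms that the more even distribution has the larger permutability and the smaller H.

```python
import math

def log_permutability(occupations):
    """Exact log of N! / prod(n_i!), using lgamma(n + 1) = log(n!)."""
    n_total = sum(occupations)
    return math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in occupations)

def h_quantity(occupations):
    """Boltzmann's H for a state distribution: sum of n_i log n_i (0 log 0 taken as 0)."""
    return sum(n * math.log(n) for n in occupations if n > 0)

def stirling_log_permutability(occupations):
    """Stirling estimate of log P: N log N - sum_i n_i log n_i."""
    n_total = sum(occupations)
    return n_total * math.log(n_total) - h_quantity(occupations)

uniform = [2500, 2500, 2500, 2500]   # hypothetical occupation numbers, N = 10000
skewed = [7000, 1000, 1000, 1000]

for occupations in (uniform, skewed):
    print(log_permutability(occupations),
          stirling_log_permutability(occupations),
          h_quantity(occupations))
```

Using `math.lgamma` avoids forming the enormous factorials explicitly, which matters once N is more than a few hundred.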

Suppose that the actual state of the gas is equally likely to be in any of the elementary cells into which its phase space $Γ$ has been partitioned. The most probable state distribution will be the one that minimizes H. Moreover, for a monatomic ideal gas confined to a volume V and having total energy E, the value of H, for this maximally likely state distribution, is

$Display mathematics$(29)

The negative of this is proportional to the thermodynamic entropy of an ideal gas.

As the negative of H(n) differs from $\log P(n)$ by an additive constant, and the phase-space volume of the Z-star corresponding to state distribution n is proportional to $P(n)$, the negative of H(n) is (up to an additive constant) equal to the logarithm of the Liouville measure of the Z-star. For a macroscopic system, provided the phase-space cells are not too small, the macrostate that minimizes H will take up all but a negligible fraction of the phase space available to the system, given the macroscopic constraints. Assuming that a system, left to itself, will tend to spend most of its time in the macrostate that has the largest phase-space volume, out of the macrostates available to it, this macrostate is the equilibrium macrostate. Thus, the thermodynamic entropy is, up to an additive constant, proportional to the logarithm of the phase-space volume available to the system.
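How completely the H-minimizing macrostate comes to dominate can be seen in a deliberately simple model. The sketch below assumes just two cells (say, the left and right halves of a container), treats every complexion as equally weighted, and counts as "near uniform" any state distribution whose left-hand occupation number is within a fixed tolerance of N/2. The function name and the tolerance are illustrative choices.

```python
import math

def near_uniform_fraction(n_molecules, max_deviation):
    """Fraction of the 2**N complexions with |n_left - N/2| <= max_deviation (N even)."""
    half = n_molecules // 2
    count = sum(math.comb(n_molecules, n_left)
                for n_left in range(half - max_deviation, half + max_deviation + 1))
    return count / 2 ** n_molecules

# Same relative tolerance (2% of N) at two system sizes.
print(near_uniform_fraction(100, 2))
print(near_uniform_fraction(10000, 200))
```

With a hundred molecules fewer than half the complexions fall within the tolerance; with ten thousand, nearly all do. For macroscopic N the excluded fraction becomes negligible even for far tighter tolerances.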

This suggests the following generalization of Boltzmann’s considerations to cases in which macrostates are not required to be a function only of occupation numbers of regions of the single-particle space $\mu$. Suppose the macrostate of the system is defined by the values of macrovariables $\{X_1, \ldots, X_k\}$. Partition the accessible phase space $\Gamma$ into regions corresponding to small intervals of values of these macrovariables; each such region consists of points that, for practical purposes, have the same values of the macrovariables. For any microstate x, let M(x) be the macrostate it lies within, and let $|M(x)|$ be the Liouville measure of that macrostate. Then the entropy assigned to a phase point x is given by

$S_B(x) = k \log |M(x)|$(30)

It is this generalization that is referred to as Boltzmann entropy in current presentations of the Boltzmannian approach to statistical mechanics. See, for example, Goldstein (2001) and Lebowitz (1993, 1999b). This generalization is required to take into account systems for which interparticle potentials are significant for the macrostate, and for which −kHmax is not an adequate representation of the entropy. See Jaynes (1965, 1971) for discussion.

Equation (30) might appear to assign an entropy to a system that is a property of the physical state of the system alone. But note that the value of the Boltzmann entropy depends not only on the phase point x but also on the macrovariables chosen to define macrostates (selected as of interest either as the ones that are measurable or for some other reason), on a partition of the macrovariables into sets fine enough that differences within a set are regarded as negligible, and on a choice of measure of subsets of phase space, the Liouville measure.
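The dependence on the choice of macrovariables can be made concrete with a toy model. In the sketch below, which is an illustration rather than anything drawn from the cited literature, a microstate is a tuple of bits (one per labeled molecule), counting measure stands in for Liouville measure, and k = 1. Two different coarse grainings assign different entropies to the same microstate.

```python
import math

def entropy_coarse(microstate):
    """log |M(x)|, k = 1, when the macrovariable is the total number of 1s."""
    return math.log(math.comb(len(microstate), sum(microstate)))

def entropy_fine(microstate):
    """log |M(x)| when the macrovariables are the numbers of 1s in each half of the labels."""
    half = len(microstate) // 2
    first, second = microstate[:half], microstate[half:]
    cells = math.comb(len(first), sum(first)) * math.comb(len(second), sum(second))
    return math.log(cells)

x = (1, 1, 1, 1, 0, 0, 0, 0)
print(entropy_coarse(x), entropy_fine(x))
```

Under the coarser description the microstate x lies in a macrostate containing 70 microstates; under the finer description it is the only member of its macrostate, and its entropy is zero.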

For an ideal gas, an explicit calculation shows that the dependence of $S_B$ on total energy and volume is the same as that of thermodynamic entropy. The question arises whether this will hold also for other systems. As already mentioned, for a system with a large number of components, if the energy is primarily kinetic energy, a canonical distribution will be tightly focused on the mean value of energy. Expectation values for macroscopic variables can be expected to approximate values calculated for a microcanonical distribution at that mean energy. Suppose, now, that $M_1$ and $M_2$ are two macrostates to which the system could be confined for appropriate constraints, and that it is possible to move from one to the other via slow, gradual variation of the constraints. Then the argument that led to equation (18) leads to the conclusion that the difference between Gibbs entropies corresponding to canonical distributions subjected to those constraints approximates the difference between thermodynamic entropies of the corresponding macrostates. If, then, the system is one for which Gibbs entropy differences approximate differences in Boltzmann entropy, the Boltzmann entropy differences will approximate differences in thermodynamic entropy. This does not imply that Boltzmann entropy differences will correspond, even approximately, to differences in thermodynamic entropy for systems that do not satisfy these conditions.

### 4. Status of Probabilities

It has become commonplace in the literature on philosophy of probability to note that the word “probability” is used in more than one sense (see Hacking, 1975 for the history of this). Senses of the word “probability” are typically grouped into two classes. One class is epistemic, having to do with degrees of belief of a (possibly idealized) agent with limited knowledge about the world. These degrees of belief are often called credences. The other is objective and physical. A probability in this sense is an objective feature of a physical situation, or chance setup, such as found in games of chance, and may have a value unknown to anyone. These are often called chances. These are not to be thought of as rivals for the unique, correct meaning of “probability”; they are distinct but related concepts, each with its own role to play.

There is a long history of attempts to construe probability as merely a matter of counting possibilities, a view that is sometimes, incorrectly, attributed to Laplace (1814). It is often associated with a Principle of Indifference, which is intended to prescribe unique probabilities in a state of absolute ignorance. A view of this sort is sometimes referred to as classical probability.

In the latter half of the 19th century, it became common to take statements about objective probabilities to involve implicit reference to relative frequency in an actual or hypothetical sequence of similar events, a view known as frequentism about probabilities. Among the prominent exponents of this view are Venn (1866) and Richard von Mises (1928).

Both the classical conception and frequentism are fraught with difficulties. As Laplace clearly saw, a conception of probability as counting of possibilities requires as input a partition of ways the world can be whose elements have already been judged to be equiprobable. This is not to be obtained from mere ignorance, and the impression that it can be seems to stem from conflating an absence of judgement about two possibilities with a positive judgement that they are equiprobable (or, as he said, “equally possible”). Frequentism faces difficulties of its own, among which is what has been called the reference class problem. One and the same event will be part of an indefinite number of different sequences, with different limiting frequencies; unless there is a unique way to discriminate among these, the event will be assigned different probabilities, depending on the sequence to which it is assigned. For further criticisms of frequentism, see La Caze (2016) and Myrvold (2021a).

There have been attempts to construe probabilities in statistical mechanics as purely subjective; see Uffink (2011) for an overview. These face a difficulty in that predictions are made based on statistical-mechanical probabilities, which are tested by experiment.

Another proposal that has been made is that probabilities, even in classical statistical mechanics, are to be derived from quantum-mechanical probabilities. See Albert (1994, 2000), Albrecht and Phillips (2014), and Wallace (2016).

Another suggestion is that understanding the use of probabilities in statistical mechanics requires going beyond the familiar dichotomy of objective chance and subjective credence, and that a hybrid conception of probability invoking both epistemic and physical considerations is needed; see Myrvold (2012, 2021a).

### 5. Justifying Choice of Equilibrium Measures

A classical microcanonical distribution is uniform, in canonical phase-space variables, within a small energy shell. The quantum version is an equally weighted mixture of energy eigenstates with eigenvalues within a small interval. The question to be addressed is, What distinguishes this measure from other measures on an energy shell that might be chosen?

Part of the answer to this question lies in the fact that the question concerns equilibrium measures. Under the Gibbsian approach, thermal equilibrium is not to be thought of as a static state; it is one in which the microstate is constantly changing and the macrostate, though approximately constant most of the time, is subject to fluctuations, with large fluctuations (for macroscopic systems) being much rarer than small ones. If these fluctuations are unpredictable, an appropriate probability distribution will have the probability of a given fluctuation be constant with time. This means that the equilibrium distributions should be stationary distributions.

It follows from the Liouville equation that, for a conservative system, any distribution given by a density function that is a function of the energy is a stationary distribution. It is, therefore, easy to see that the microcanonical and canonical distributions are stationary.
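The stationarity claim follows in one line. The sketch below assumes a time-independent Hamiltonian and uses canonical coordinates $q_j$, $p_j$ and the Poisson bracket in a standard sign convention; this notation is supplied here for illustration and is not taken from the text.

```latex
\begin{align*}
\frac{\partial \rho}{\partial t} &= \{H, \rho\}
  = \sum_j \left( \frac{\partial H}{\partial q_j}\frac{\partial \rho}{\partial p_j}
                - \frac{\partial H}{\partial p_j}\frac{\partial \rho}{\partial q_j} \right), \\
\rho = f(H) \;\Longrightarrow\; \frac{\partial \rho}{\partial t}
  &= f'(H)\,\{H, H\} = 0 .
\end{align*}
```

The canonical density corresponds to $f(H) \propto e^{-\beta H}$, and the microcanonical density to an $f$ that is constant within the energy shell and zero outside it.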

#### 5.1 The Hypothesis of Uniform a Priori Probabilities

In an influential textbook published in 1938, Tolman introduced what he called “the fundamental hypothesis of equal a priori probabilities.” In connection with this, he wrote,

Although we shall endeavour to show the reasonableness of this hypothesis, it must nevertheless be regarded as a postulate which can ultimately be justified only by the correspondence between the conclusions which it permits and the regularities in the behaviour of actual systems which are empirically found.

(Tolman, 1938, p. 59)

Tolman argued for the reasonableness of this postulate on the basis of Liouville’s theorem, which entails that a distribution uniform in canonical phase space variables is a stationary distribution. This shows that “the principles of mechanics do not themselves include any tendency for phase points to concentrate in particular regions of the phase space” (p. 61).

Under the circumstances we then have no justification for proceeding in any manner other than that of assigning equal probabilities for a system to be in different equal regions of the phase space that correspond, to the same degree, with what knowledge we do have as to the actual state of the system. And, as already intimated, we shall, of course, find that the results which can then be calculated as to the properties and behaviour of systems do agree with empirical findings.

(Tolman, 1938, p. 61)

This is somewhat reminiscent of an invocation of a Principle of Indifference, albeit not an incautious one that ignores the necessity of a choice of variables over which to impose uniformity. Subsequent authors have taken the Principle of Indifference as a foundational principle for statistical mechanics. See, for example, Jackson (1968, p. 83). E. T. Jaynes’s Principle of Maximum Entropy (Jaynes, 1957a, 1957b) is a version of the Principle of Indifference.

#### 5.2 Probabilities from Dynamics

##### 5.2.1 Approaches Based on Long-Term Time Averages

One family of approaches seeks to characterize probabilities in terms of long-term time averages. If the fraction of time spent in a given subset of a system’s state space during a given duration of time approaches a limit as the duration considered is increased indefinitely, then this fraction is an objective physical fact about the system. Moreover, both empirical evidence and theoretical considerations suggest that the value of this fraction will, for macroscopic systems, be largely insensitive to initial conditions, and be determined solely by the total energy of the system and the external constraints imposed on it.

This conception, like so much of statistical mechanics, has its origin in the work of Boltzmann, who conjectured that,

The great irregularity of the thermal motion and the multitude of forces that act on a body make it probable that its atoms, due to the motion that we call heat, traverse all positions and velocities which are compatible with the principle of [conservation of] energy.

(Quoted in Uffink, 2007, p. 40, from Boltzmann, 1871, p. 707, 1909a, p. 284)

This (or rather, a variant of it) has come to be known as the ergodic hypothesis. Taken literally, it cannot be correct as stated, as was proven by Plancherel (1913) and Rosenthal (1913). But it can be true that every set of nonzero measure is eventually entered by almost all trajectories, that is, all except for a set of zero measure. A system for which this is true is called ergodic, in the terminology that has been adopted by mathematicians. Ergodicity is equivalent to there being no constants of the motion other than energy. It follows from a theorem due to Birkhoff that, for an ergodic system, for almost all trajectories, the long-term limiting value of the fraction of time spent in a given subset of the available state space is proportional to the measure of that set (Birkhoff, 1931a, 1931b). Proving ergodicity is difficult for anything like a realistic system. Moreover, there are systems, namely, those to which the KAM theorem applies, that are provably not ergodic. See Arnold (1989), Appendix 8, for exposition of the KAM theorem, and Berkovitz et al. (2006) for discussion of the applicability of ergodic theory to physics.
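The content of Birkhoff's result can be illustrated with one of the simplest ergodic systems known, which is not a mechanical system and is used here only as a stand-in. The sketch assumes a point moving around a circle of unit circumference in steps of a fixed irrational size (an irrational rotation), with the uniform measure playing the role of the invariant measure; the function name and the chosen interval are illustrative. The fraction of time spent in an interval approaches the interval's length whatever the starting point, whereas for a rational step the fraction depends on where the trajectory starts.

```python
import math

def time_fraction_in_interval(x0, alpha, a, b, steps):
    """Fraction of the first `steps` points x0 + n*alpha (mod 1) that lie in [a, b)."""
    hits = 0
    for n in range(steps):
        x = (x0 + n * alpha) % 1.0
        if a <= x < b:
            hits += 1
    return hits / steps

golden = (math.sqrt(5.0) - 1.0) / 2.0   # an irrational step size

# Ergodic case: the time fraction is close to the interval length 0.3 for each start.
for start in (0.0, 0.37, 0.9):
    print(time_fraction_in_interval(start, golden, 0.0, 0.3, 100000))

# Non-ergodic case (rational step): the answer depends on the initial condition.
print(time_fraction_in_interval(0.0, 0.5, 0.0, 0.3, 100000))
print(time_fraction_in_interval(0.35, 0.5, 0.0, 0.3, 100000))
```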

Darrigol has argued that taking Boltzmann to be assuming ergodicity is a misreading, and that Boltzmann should be read as assuming what Darrigol has called Boltzmann’s hypothesis,

since we know by experience that under given macroscopic conditions an isolated thermodynamic system evolves toward a well-defined state of equilibrium, we should naturally assume that for all practical purposes the long-term averaged behavior of the system does not depend on the original “phase” or microstate, as long as this microstate remains compatible with the initial macroscopic conditions.

(Darrigol, 2018a, p. 558)

The phrase “for all practical purposes” is to be understood as requiring that “for times of the order of the empirically known thermalization periods, the time averages of accessible physical properties do not sensibly depend on the initial conditions, with negligible exceptions” (Darrigol, 2018a, p. 559).

This is weaker than ergodicity in one respect, as it refers to the time averages, not of all physical quantities, but of accessible physical properties. It can thus be satisfied by systems that are not ergodic. It is also stronger than ergodicity in one respect, as a timescale is specified, whereas the condition of ergodicity has to do with average behavior in the infinite limit, which by itself carries no implications for behavior in any finite time period, no matter how large.

Behavior of the sort prescribed by Boltzmann’s hypothesis can be demonstrated for simple, idealized systems; this, together with empirical observations, lends credibility to the thought that the hypothesis holds for realistic systems.

Suppose it is established that the Boltzmann hypothesis, or some other hypothesis concerning long-term averaged behavior, holds for some system of interest. The question arises of the relevance of long-term averages to probabilities of the results of observations.

For a system that has been left to evolve by itself for a time long compared to the system’s timescale of relaxation to equilibrium, and then observed at a random time, it is appropriate to equate the probability of finding the system in a given macrostate with the long-time average. This does not, however, help in setting near-term expectations for a system prepared out of equilibrium.

Another argument that has been given for considering the long-term time average is as follows (adapted from Khinchin, 1949, pp. 44–45). Measurements of thermodynamic variables such as, say, temperature, are not instantaneous but have durations, which, though short on human timescales, are long on the timescales of molecular evolution. What is measured, then, is in effect a time average over a time period that counts as a very long time period on the relevant scale.

This rationale is problematic. The timescales of measurement, though long, are not long enough that the average over them necessarily approximates an average over unlimited time. As Sklar (1993, p. 176) pointed out, if they were, then the only measured values obtainable for thermodynamic quantities would be equilibrium values. This, as Sklar put it, is “patently false”; the approach to equilibrium can, in fact, be tracked by measuring changes in thermodynamic variables.

If one is to ask for a probability distribution appropriate to thermodynamic equilibrium, the distribution should be a stationary distribution. The microcanonical distribution is a stationary distribution on $\Gamma_E$. If the system is ergodic, then it is the only stationary distribution among those that assign probability zero to the same sets that it does. For a justification of the use of the microcanonical distribution along these lines, see Malament and Zabell (1980).

##### 5.2.2 Convergence of Probability Distributions

For appropriate sorts of dynamics there can be convergence, or approximate convergence, of probability distributions over initial conditions to limit distributions. This is one way to think about the process of equilibration. This sort of convergence has been extensively studied by mathematicians; results in this area are often (misleadingly) referred to as the method of arbitrary functions. Ideas of this sort have drawn the attention of philosophers; see Abrams (2012), Myrvold (2012, 2021a), Rosenthal (2010, 2012), and Strevens (2003, 2011) for an array of approaches in which the method of arbitrary functions plays a role.

These results can pick out an appropriate equilibrium distribution; it is the distribution toward which others converge. It cannot, however, generate probabilities out of nothing, as it requires probability distributions for input. Hence, any use of the method must address the question of the status of the input distributions. Poincaré (1902, p. 234) said that input distributions may be imposed as an arbitrary convention. Strevens (2003) was noncommittal on the interpretation of the input probabilities, whereas Strevens (2011) and Abrams (2012) opted for distributions based on actual frequencies.
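A stock illustration of this kind of convergence is a wheel divided into alternating red and black sectors. The sketch below is a toy model under stated assumptions: the initial speed v lies in [0, 1], the wheel stops in sector number floor(sectors × v), even-numbered sectors are red, and the two input distributions (given by their cumulative distribution functions) are arbitrary illustrative choices. The more sectors the wheel sweeps per unit of speed, the closer the probability of red comes to one half for both inputs, although the inputs themselves disagree substantially.

```python
def red_probability(cdf, sectors):
    """Probability of red when the wheel stops in sector floor(sectors * v), v in [0, 1].

    `cdf` is the cumulative distribution function of the initial speed v;
    even-numbered sectors are red.
    """
    total = 0.0
    for j in range(0, sectors, 2):
        total += cdf((j + 1) / sectors) - cdf(j / sectors)
    return total

def linear_density_cdf(v):
    """CDF of the input density 2v on [0, 1]."""
    return v * v

def quadratic_density_cdf(v):
    """CDF of the input density 3v^2 on [0, 1]."""
    return v ** 3

for sectors in (2, 20, 200):
    print(sectors,
          red_probability(linear_density_cdf, sectors),
          red_probability(quadratic_density_cdf, sectors))
```

The calculation takes a probability distribution over initial speeds as input and returns a probability as output, which is the point made above: the dynamics wash out differences between input distributions but do not dispense with them.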

Savage (1973) suggested that the input probabilities be given a subjectivist interpretation. For the right sorts of dynamics, widely differing probabilities about initial conditions will lead to probability distributions about later states of affairs that agree closely on outcomes of feasible measurements; hence, the output probabilities might be called “almost objective” probabilities. This suggestion has been developed in Myrvold (2012, 2021a), where the resulting probabilities are called epistemic chances, to highlight their hybrid nature: they combine both epistemic and physical considerations.

##### 5.2.3 Probabilities from Quantum Mechanics?

Unlike classical mechanics, quantum mechanics invokes probabilities in its very formulation. Can quantum-mechanical probabilities account for all use of probability in statistical mechanics, including classical statistical mechanics? An affirmative answer to this question has been argued by Wallace (2016), and by Albrecht and Phillips (2014), who estimated the relevance of quantum uncertainty to stock examples such as coin flips and billiard-ball gases, and concluded that “all successful applications of probability to describe nature can be traced to quantum origins.”

As emphasized by Albert (2000, chap. 7), if isolated quantum systems are assumed to follow the usual Schrödinger evolution, then this leaves the conceptual situation pretty much the same as in classical mechanics. The dynamics governing the wave-function are reversible; for any state that approaches equilibrium, there is another state that moves away from it. Considering non-isolated systems only pushes the problems further out. The state of the system of interest plus a sufficiently large environment can be treated as an isolated system, and there will be states of this larger system that lead to a failure of equilibration for the subsystem of interest.

There are, however, approaches to the interpretation of quantum mechanics that hold that quantum state collapse is a genuinely chancy dynamical process. See Bassi and Ghirardi (2003) and Ghirardi and Bassi (2020) for overviews of dynamical collapse theories. On a dynamical collapse theory, for any initial state, there are objective probabilities for any subsequent evolution of the system. Albert (1994, 2000) has argued that these probabilities suffice to do the job required of them in statistical mechanics.

If this proposal is correct, then, on timescales expected for relaxation to equilibrium, the probability distribution yielded by the collapse dynamics will approach a distribution that is appropriately like the standard equilibrium distribution, where “appropriately like” means that it yields approximately the same probability distributions for measurable quantities. It is not to be expected that the equilibrium distribution be an exact limiting distribution for long time intervals; in fact, distributions that are stationary under the usual dynamics will not be strictly stationary under the stochastic evolution of dynamical collapse theories such as the Ghirardi-Rimini-Weber (GRW) or Continuous Spontaneous Localization (CSL) theory, as energy is not conserved in these theories. However, the energy increase will be so small as to be undetectable under ordinary circumstances (Bassi and Ghirardi, 2003, p. 310, estimated, for a macroscopic monatomic ideal gas, a temperature increase on the order of $10^{-15}$ Celsius degrees per year), and so these models entail relaxation to something closely approximating a standard equilibrium distribution on the expected timescales, followed by exceedingly slow warming.

There is a worry about the project of explaining equilibration by reference to dynamical collapse, mentioned by Albert (2000, pp. 156–159) as having been raised by Larry Sklar and Philip Pearle. This sort of worry has to do with equilibration, or lack thereof, in systems for which a good approximation to unitary evolution would be expected. One such case would be a gas consisting of around $10^5$ molecules. Would it have a tendency to spread out, if originally confined to a small region of the available volume? If so, then this tendency cannot be attributed to the stochasticity of GRW dynamics, as, for a system that small, the theory predicts a close approximation to unitary evolution. Another case has to do with spin-echo experiments, in which a tendency to approach a state that looks macroscopically random can be observed, and, yet, demonstrably, the evolution has been unitary, or close to it; otherwise, it would not be possible to restore the initial state by the reversing pulse (see Hahn, 1953 for exposition). In both cases, there exist states that do not tend to equilibrate, but these are sensitive to small perturbations, and it would be extraordinarily difficult to reliably prepare them, and they are not expected to occur in nature. There is, it seems, an ineliminable role for uncertainty about initial conditions to play, even in the quantum context.

### 6. Explaining Equilibration

It is typically assumed that a system confined to a bounded region of space will, if left to itself, eventually relax to an equilibrium state. Though not traditionally regarded as a law of thermodynamics, an assumption of this sort has been counted among the laws of thermodynamics by some authors (Brown & Uffink, 2001; Uhlenbeck & Ford, 1963).

This is a temporally asymmetric phenomenon, at least on the usual way of thinking of such things; an isolated system that is not in equilibrium is likely to be closer to equilibrium in the near future and to have been further from equilibrium in the near past. Temporal asymmetry of this sort is not built into the underlying dynamics. If a temporally asymmetric conclusion is to be obtained, some other considerations must be brought to bear.

Though the reversibility argument for this conclusion invokes dynamics invariant under time-reversal, this is not really the main issue. The issue remains in full force if the dynamics are Hamiltonian but not T-invariant. On the standard model, the weak force is not invariant under time-reversal but is invariant under the operation CPT, which combines time-reversal with charge conjugation and parity inversion. A universal tendency to equilibrium, encompassing both matter and antimatter, breaks CPT symmetry every bit as much as it breaks T symmetry.

Equilibration involves a loss of distinguishability at the macroscopic level. Macroscopically distinguishable states relax to the same equilibrium state, and, once a system has relaxed to equilibrium, it is not possible to recover the past macrostate from the current macrostate. This is in contrast to how things are at the microscopic level. Since, in classical mechanics, there is a measure that is conserved under Hamiltonian evolution, two sets of states with measure-zero overlap evolve into sets with measure-zero overlap, and, more generally, the extent of the overlap between any two sets of states is conserved under Hamiltonian evolution. In that sense, distinguishability of sets of states is conserved. In quantum mechanics, there is a conserved inner product, and orthogonal states evolve into orthogonal states. The salient tension between micro- and macrodynamics is not between time-reversible microdynamics and temporal asymmetry of the process of equilibration, but between conservation of distinguishability at the microphysical level and loss of distinguishability at the macrophysical level. This loss of distinguishability is possible because attention is confined to a restricted set of degrees of freedom of the system under consideration, and not the full specification of the state. This can be done in two ways: either by considering an isolated system, restricting attention to the evolution of a limited set of degrees of freedom, the macrovariables, or else by considering a system interacting with its environment, restricting attention to the evolution of the state of the system of interest. Though these might appear at first glance to be radically different approaches, conceptually they amount to pretty much the same thing. Both approaches involve tracking the evolution of a limited set of degrees of freedom of a larger system. The larger system is itself treated as isolated, and hence undergoing Hamiltonian evolution.

Because there is no hope of demonstrating approach to equilibrium in the near future for all initial states, considerations of probability are often brought into play. A different, though related, approach eschews talk of probabilities in favor of considerations of typicality. In this sort of approach, judgements about what is probable or improbable are replaced by judgements about behavior of typical states, on some way of judging typicality. Representative expositions of such a view include Goldstein (2001, 2012), Lebowitz (1999a), and Price (2002). See Frigg (2009, 2011), Hemmo and Shenker (2012), and Pitowsky (2012) for critiques of these approaches; for recent elaborations on and defenses of the approach, see Allori (2020b), Badino (2020), Crane and Wilhelm (2020), Frigg and Werndl (2012), Hubert (2021), Maudlin (2020), and Werndl (2013).

There are three main lines of approach to explanation of equilibration, all of which have their roots in the work of Boltzmann.

One involves assumption of some temporally asymmetric condition on microstates, or else a temporally asymmetric assumption about probabilities of microstates. This is exemplified by what Ehrenfest and Ehrenfest (1912) dubbed the Stoßzahlansatz, and the condition that Boltzmann called “molecular disorder.” This is the condition that the states of molecules about to collide can, without appreciable error, be taken to be statistically independent of each other (Boltzmann, 1895a, 1896, pp. 20–22, 1964, pp. 40–42). It is temporally asymmetric, because, for a system approaching equilibrium, the corresponding assumption about the states of molecules that have recently collided cannot be made; having just interacted and possibly exchanged energy, their energies will be negatively correlated.

Assumptions of this sort may be motivated by a temporally asymmetric notion of causality. The idea is that correlations between the states of systems are to be explained by events in the past, and not by reference to events in the future. See Penrose (2001) and Penrose and Percival (1962) for proposals explicitly based on a temporally asymmetric causality assumption. One complication for this approach is that, for any pair of molecules about to collide, there are always events in their common past that could have correlated them; the same two molecules may have previously collided, or may have both interacted with another, intermediary molecule. Boltzmann’s rationale for treating the states of molecules as independent prior to a collision is that in a dilute gas a given molecule travels a long distance between two successive collisions, so that the collision environment is completely changed and is independent of the motion of the molecule (see Darrigol, 2018a, pp. 231, 379). Thus, though it is not possible for a gas to remain more than instantaneously in a state in which there are no correlations between molecules, it is possible for these correlations to be largely irrelevant, for a long time, for the evolution of features of interest, such as the distribution of velocities among molecules.

Another option is to adopt a time-symmetric posit about probabilities of microstates, or else about typicality of microstates. In the classical context, this may take the form: given a system’s current macrostate, take the restriction of the Liouville measure to that macrostate. This probability distribution attributes the same probability to any subset of the state space as it does to the set that arises from it by reversing all velocities. Similarly, if one adopts the Liouville measure as a measure of typicality, judgements of typicality of sets of states will be invariant under velocity reversal. Suppose that a posit of this sort is imposed at some time t0 and that the dynamics are time-reversal invariant. If the project of showing that the posit entails high probability of equilibration to the future of t0 succeeds, this carries with it the unwanted consequence that, if the system was governed by the same laws of evolution in the time preceding t0, it was, with high probability, closer to equilibrium to the past of t0. In the same vein, if the Liouville measure is used for judgements of typicality, and the project of showing that equilibration toward the future is typical succeeds, this carries the unwanted conclusion that equilibration toward the past is typical.
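The symmetry of this consequence can be seen in a toy model. The sketch below uses the Kac ring, a standard pedagogical model that is not discussed in this article, as a stand-in for time-reversal invariant dynamics; the ring size, the density of marked edges, and the function names are illustrative assumptions. Balls coloured +1 or −1 sit on the sites of a ring, and a ball changes colour whenever it crosses a marked edge. The macrovariable is the mean colour, which equals 1 in the far-from-equilibrium state imposed at t0 (a macrostate containing a single colour configuration) and is close to 0 at equilibrium.

```python
import random


def step_forward(colors, marked):
    """Each ball moves one site clockwise, flipping colour if it crosses a marked edge."""
    n = len(colors)
    new = [0] * n
    for i in range(n):
        new[(i + 1) % n] = -colors[i] if marked[i] else colors[i]
    return new


def step_backward(colors, marked):
    """Exact inverse of step_forward: balls move anticlockwise across the same edges."""
    n = len(colors)
    new = [0] * n
    for i in range(n):
        c = colors[(i + 1) % n]
        new[i] = -c if marked[i] else c
    return new


def mean_colour(colors):
    return sum(colors) / len(colors)


def history(colors, marked, step, n_steps):
    """Mean colour after 0, 1, ..., n_steps applications of `step`."""
    values = [mean_colour(colors)]
    for _ in range(n_steps):
        colors = step(colors, marked)
        values.append(mean_colour(colors))
    return values


rng = random.Random(1)
N = 2000
marked = [rng.random() < 0.05 for _ in range(N)]   # edge i joins site i to site i + 1
at_t0 = [1] * N                                    # the state imposed at t0

future = history(at_t0, marked, step_forward, 60)
past = history(at_t0, marked, step_backward, 60)
print(future[::10])
print(past[::10])
```

In this model the two lists coincide exactly, since the same windows of marked edges are summed over in either direction. Whatever relaxation of the mean colour away from 1 is found toward the future of t0 is therefore found, to the same degree, toward the past of t0.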

Some of the discussions of probabilities in statistical mechanics seem to be motivated by the thought that probability is merely a matter of counting possibilities. If this thought is rejected, then there is no reason to employ a time-symmetric posit about probability (or typicality) at any time. If, nonetheless, a time-symmetric posit is to be employed, it can be appropriate only for a special time, prior to which no retrodictions are to be made (that is, a first instant of the universe, or a first instant at which current physics can be regarded as reliable), or at the peak of a temporary fluctuation away from equilibrium. These alternatives correspond to the two main lines of approach among those who employ a time-symmetric probability or typicality posit.

On one approach, one posits a far-from-equilibrium initial macrostate for the early universe, and a uniform probability distribution over microstates compatible with it. Such a posit has come to be called a “Past Hypothesis,” following Albert (2000). On the other approach, the universe as a whole is taken to be in an overall state of thermodynamic equilibrium, with rare, small fluctuations away from equilibrium macroconditions, and even rarer large fluctuations. If the universe is large enough, there might even exist spontaneous fluctuations corresponding to the conditions found in the portion of the universe observable from Earth, thought of as a tiny portion of the universe as a whole.

Both of these posits have their roots in the work of Boltzmann. (For the “Past Hypothesis,” see Boltzmann, 1898, pp. 255–256, 1964, p. 444. For the conjecture that the universe as a whole is in a state of equilibrium, attributed by Boltzmann to his “old assistant, Dr. Schütz,” see Boltzmann, 1895b, 1898, pp. 256–259, 1964, pp. 446–448.) The two approaches differ in their accounts of observed temporal asymmetry. On the Past Hypothesis, temporal asymmetry is a real feature of the universe, having to do with initial conditions. The posit is temporally asymmetric because it involves an assumption about far-from-equilibrium initial conditions with no corresponding assumption about future conditions. Under the Boltzmann-Schütz cosmology, there is no temporal asymmetry of the universe as a whole; the temporal asymmetry of the behavior of everything that is observed by humans is merely local, and there will also be pockets of the universe in which equilibration takes place in the temporal direction opposite from what we call the future (the denizens of such regions will, of course, regard the temporal direction of the approach to equilibrium as the future direction).

The three approaches manifest three different attitudes toward explanation of the ubiquitous temporal asymmetries observed all around us. The first approach attempts to explain observed temporal asymmetry by reference to a temporally asymmetric notion of causality. The second posits a temporal asymmetry, without explanation. The third explains it away; on the Boltzmann-Schütz cosmology, the universe, despite all appearances to us, has no temporal asymmetry.

It has been argued that there is a consequence of the Boltzmann-Schütz cosmology that Boltzmann seems not to have noticed. On such a scenario, the vast majority of occurrences of a given level of departure from equilibrium would be near a local maximum. Furthermore, the argument goes, macrostates such as the one observed, complete with what appear to be records of a lengthy past even further from equilibrium, will more commonly occur as the result of fluctuations (in which case the apparent records are illusory) than as the result of an actual history of the sort that the apparent records indicate. If this is correct, then, on the Boltzmann-Schütz cosmology, it should be regarded as overwhelmingly probable that both the future and the near past are conditions closer to equilibrium. Taking such reasoning to its inevitable consequence, one should take oneself to be whatever the minimal physical system is that is capable of supporting experiences; the apparent experience of being surrounded by an abundance of far-from-equilibrium matter is also an illusion. That is, one should take oneself to be what has been called a “Boltzmann brain.” (The term is due to Andreas Albrecht. It first appears in print in Albrecht & Sorbo, 2004.)

This is a logically possible scenario. However, not only does it involve rejection of judgements of what is typical that are based on experience, in which out-of-equilibrium systems are ubiquitous, it even goes as far as to involve rejection of all of one’s experience as illusory. Yet these considerations result from physics, physics based on empirical evidence that the world is to be described, at least approximately, as a large number of molecules evolving according to Hamiltonian dynamics. Thus, the hypothesis undermines its own empirical base (Albert, 2000, p. 116). It has been argued that an empirically self-undermining hypothesis of this sort should be assigned very low credence (Carroll, 2021). See Kotzen (2021) for further discussion of the epistemic status of such hypotheses.
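The claim that, in an equilibrium universe, most occurrences of a given departure from equilibrium lie near a local maximum of that departure can be illustrated in a toy model. The sketch below uses the Ehrenfest urn model (n balls distributed over two urns, with one ball chosen at random and moved to the other urn at each step) in its stationary state. The model, the parameter values, and the function names are assumptions made for illustration; they are not drawn from the works cited above.

```python
from fractions import Fraction
from math import comb


def stationary(n):
    """Stationary (binomial) distribution of the number k of balls in urn A."""
    return [Fraction(comb(n, k), 2 ** n) for k in range(n + 1)]


def p_up(n, k):
    """Probability that the next step moves a ball into urn A (k -> k + 1)."""
    return Fraction(n - k, n)


def p_down(n, k):
    """Probability that the next step moves a ball out of urn A (k -> k - 1)."""
    return Fraction(k, n)


def prob_next_lower(n, k):
    """P(next state is k - 1 | current state is k)."""
    return p_down(n, k)


def prob_prev_lower(n, k):
    """P(previous state was k - 1 | current state is k), in the stationary regime."""
    pi = stationary(n)
    return pi[k - 1] * p_up(n, k - 1) / pi[k]   # Bayes' rule


def prob_local_peak(n, k):
    """P(previous and next states are both k - 1 | current state is k).

    Past and future are conditionally independent given the present,
    because the process is a Markov chain.
    """
    return prob_prev_lower(n, k) * prob_next_lower(n, k)


n, k = 100, 80      # equilibrium is k = 50; k = 80 is a sizeable fluctuation
print(prob_prev_lower(n, k), prob_next_lower(n, k), prob_local_peak(n, k))
```

For n = 100 and k = 80, the probability that the next state is closer to equilibrium is 4/5, the probability that the previous state was closer to equilibrium is also 4/5, and the probability that the current state is a local peak of the fluctuation is 16/25. In this toy model, then, finding the system at a given distance from equilibrium makes it probable that it was closer to equilibrium both just before and just after.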

The phrase “Gibbs paradox” has been used for two clusters of issues, both connected with the entropy of mixing, that have been thought by some to be paradoxical.

One cluster of issues belongs to thermodynamics and has to do with the fact that the entropy of mixing of two distinct gases at the same temperature and pressure is independent of the nature of the gases. This contradicts an assumption that has seemed plausible to some, namely, that, for entropy calculations, the case of identical gases should either be treated in the same way as distinct gases, as suggested by Duhem (1892), or as a limiting case of distinct gases, as their degree of dissimilarity is decreased, as suggested by Wiedeburg (1894). The other cluster of issues has to do with extensivity of entropy. A straightforward application of either the formula (17) for the Gibbs entropy or Planck’s formula (1) for what has come to be called Boltzmann entropy yields an entropy that is not extensive. This has also seemed paradoxical to some and has been taken to indicate something profound about the nature of identical particles. See Darrigol (2018b) for a historical overview of discussions of these two issues.

#### 7.1 The Original, Thermodynamical, Paradox

The first to assert that the entropy of mixing engenders a paradox was Duhem (1892), who, citing Neumann (1891), said that Gibbs’s result for the entropy of mixing of two gases has a “paradoxical consequence.” The argument is the following. Let $S_1(T, V, N)$ be the entropy of $N$ moles of a uniform gas in volume $V$ at temperature $T$. (In this section $S_i$ is used to denote entropy of a gas that consists of $i$ distinct types of molecules.) Suppose that the gas is conceptually divided into two subvolumes $V_a$ and $V_b$, containing $N_a$ and $N_b$ moles of gas, respectively. As the density of the gas is constant throughout the volume $V$, $V_a/N_a = V_b/N_b$. The entropy $S_1$ is extensive. This means that, if $V_a/N_a = V_b/N_b$, then

$S_1(T, V_a + V_b, N_a + N_b) = S_1(T, V_a, N_a) + S_1(T, V_b, N_b)$ (31)

The entropy of the whole gas is the sum of the entropies associated with the subvolumes.

For a system consisting of two distinct gases, each of which behaves as an ideal gas, the thermodynamic state is a function, not only of the total quantity of gas and the volume occupied, but of the quantity and volume of each separately.

$Display mathematics$(32)

Consider, now, the case of two distinct ideal gases at the same temperature $T$ initially confined to disjoint volumes $V_a$, $V_b$, then allowed to interdiffuse isothermally to fill the combined volume $V_a + V_b$. The initial entropy is

$Display mathematics$(33)

and the final entropy is

$Display mathematics$(34)

This process involves an increase of entropy:

$Display mathematics$(35)

This increase of entropy is the entropy of mixing. As Gibbs (1875, pp. 227–228, 1906, p. 166) remarked, the entropy of mixing does not depend on the nature of the two gases being mixed, only on the fact that they are different. In particular, it is independent of the degree of difference between them.
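As a worked illustration of how an entropy of mixing of this kind arises, the following sketch makes two assumptions that are not taken from the text above: that each gas has the ideal-gas entropy $S_1(T, V, N) = N R \ln(V/N) + N \varphi(T)$, with $R$ the gas constant and $\varphi$ an unspecified function of temperature, and that the entropy of a mixture of ideal gases is the sum of the entropies each gas would have if it occupied the whole volume alone.

```latex
\begin{align*}
S_{\text{initial}} &= S_1(T, V_a, N_a) + S_1(T, V_b, N_b), \\
S_{\text{final}}   &= S_1(T, V_a + V_b, N_a) + S_1(T, V_a + V_b, N_b), \\
\Delta S_{\text{mix}} &= S_{\text{final}} - S_{\text{initial}}
  = R \left[ N_a \ln \frac{V_a + V_b}{V_a} + N_b \ln \frac{V_a + V_b}{V_b} \right].
\end{align*}
```

Under these assumptions no property of either gas appears in the result. For one mole of each gas and equal initial volumes, the increase is $2R \ln 2$, about 11.5 J/K, whatever the two gases are.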

Now consider the case in which the gases are not distinct. Two samples of the same type of gas are initially confined to disjoint volumes $V_a$, $V_b$, then allowed to interdiffuse isothermally to fill the combined volume $V_a + V_b$. The initial and final thermodynamic states are the same, and thus have the same thermodynamic entropy. If, however, formula (34), rather than (31), were to be applied to this case, this would entail that the process involves an increase of entropy.

> The notion of mixing of any two gases necessarily implies the notion of mixing of two masses of gases of the same nature; so that any proposition true for a mixture of any two gases must remain true for a mixture of two masses of gases of the same nature.
>
> (Duhem, 1892, p. 54)

Duhem’s conclusion is that this premise must be rejected.

Wiedeburg, who was the first to speak of a Gibbs paradox, took the paradox to be a contradiction between the fact, already remarked by Gibbs, that the entropy of mixing is independent of the nature of the gases (as long as they are distinct), and the idea that a result valid for identical gases should be obtainable as a limiting case of distinct gases of diminishing degree of dissimilarity. This tension between discontinuous dependence of entropy of mixing on degree of dissimilarity and the thought that any meaningful physical properties of a system should depend continuously on its physical parameters is the original version of the Gibbs paradox.

As Denbigh and Redhead (1983) argued, the initial sense that there is something odd about this behavior of entropy can be alleviated by reflecting on the fact that thermodynamic entropy is itself a limit concept. The entropy difference $ΔS$ between two thermodynamic states is obtained by evaluating the integral of $đQ/T$ along a reversible process, and the concept of a reversible process is a limit concept. Any actual process involves some dissipation of energy. In classical, macroscopic thermodynamics, it is assumed that a reversible process can be approximated arbitrarily closely, by considering processes that are very slow. The least upper bound of $∫đQ/T$, taken over the set of all actual processes, is $ΔS$. The result of any actual process, taking place within a fixed duration of time, will depend continuously on the relevant physical parameters, though the quasi-static limit of a sequence of increasingly slow processes might not.

#### 7.2 Identity, Indiscernibility, and Extensivity of Entropy

The arguments presented in section 3.2 for taking either $S_G$ or $S_B$ to be a statistical-mechanical analogue of entropy involve entropy differences between states in which $N$, the total number of molecules, is unchanged. They therefore yield no information about the dependence of entropy on $N$ when states of differing $N$ are considered. That is, they are arguments that, for fixed $N$, $ΔS_G$ and $ΔS_B$ approximate differences in thermodynamic entropy; these conclusions are unchanged if an arbitrary function of $N$ is added to them.

Suppose that, nonetheless, one or the other of the quantities $S_B$ and $S_G$ were adopted as yielding an absolute entropy, valid without qualification, yielding entropy differences between states of differing $N$ as well as states of the same $N$. Applied to a classical monatomic ideal gas, the entropy formula (30) yields,

$Display mathematics$(36)

where $ω$ is the Liouville measure of an element of the partition of the phase space of an individual molecule. A straightforward application of the Gibbs entropy formula (17) yields a similar result.

The entropy defined by equation (36) is not an extensive quantity.

$Display mathematics$(37)

The excess on the right-hand side of this is just the entropy of mixing.

If entropy is taken to be additive when applied to an array of energetically isolated systems, accepting (36) as entropy has the consequence that the entropy of a gas in a container can be lowered simply by inserting an insulated partition into the container, and raised again by removing it. The fact that a straightforward application of either the formula (30) or (17) yields a non-extensive entropy, with characteristics that many take to be absurd, has, confusingly, also been called Gibbs’s paradox.

If $k \log(N!)$ is subtracted from $S_{\text{nonext}}$, applying again the Stirling approximation, the result is,

$Display mathematics$(38)

which is extensive. This is equivalent to dividing the permutability $P$ by N!, the number of permutations of N things. If this is taken to yield the correct entropy, the question arises as to the physical significance of the division by N!.
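The effect of the division by N! can be checked numerically in a simplified setting. The sketch below is an illustration under stated assumptions, not a reproduction of formulas (36) or (38): it keeps only the volume-dependent part of an ideal-gas phase-space count, so that the uncorrected entropy in units of k is N log V, and it omits all energy-dependent terms. The function names are hypothetical.

```python
from math import lgamma, log


def s_nonext(n, v):
    """Toy entropy in units of k: log of the configurational volume V**N."""
    return n * log(v)


def s_ext(n, v):
    """The same quantity after dividing the count by N!, i.e. subtracting log(N!)."""
    return n * log(v) - lgamma(n + 1)     # lgamma(n + 1) == log(n!)


def excess_on_doubling(entropy, n, v):
    """S(2N, 2V) - 2 S(N, V); this vanishes for a strictly extensive entropy."""
    return entropy(2 * n, 2 * v) - 2 * entropy(n, v)


n = 10 ** 6
v = float(n)          # unit number density (arbitrary choice)

print(excess_on_doubling(s_nonext, n, v) / n)   # per-particle excess, uncorrected
print(excess_on_doubling(s_ext, n, v) / n)      # per-particle excess, with the N! division
```

In this toy setting the uncorrected excess is exactly 2N log 2 in units of k, which is the entropy of mixing for two equal samples. With the N! division the excess is the logarithm of 4^N divided by the central binomial coefficient, which grows only logarithmically with N, so the excess per particle becomes negligible for large N.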

Ehrenfest and Trkal (1921) questioned the necessity of extensivity of entropy, and non-extensive entropies have since been defended by several authors (Dieks, 2018; Grad, 1961; Versteegh & Dieks, 2011). This requires accepting a statistical-mechanical entropy whose behavior is different from that of thermodynamic entropy.

A more common view is that, for a system composed of non-interacting subsystems, extensivity is appropriate. As has already been mentioned, Boltzmann in his own treatment simply dropped terms that are constant for constant N, yielding an extensive entropy. Several authors, beginning with Ehrenfest and Trkal (1921), have argued that the correct dependence on N cannot be obtained from considering an isolated system; it can be determined only by considering a situation in which the number of molecules in the system can vary, which means considering it, not in isolation, but as a subsystem of a larger system containing other subsystems with which it can exchange particles. A combinatorial calculation leads to the correct division by N!. See also Swendsen (2018), van Kampen (1984), and van Lith (2018).
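The combinatorial argument can be sketched for the simplest case. Assume, purely for illustration, N labelled particles free to move between two subvolumes, with the volumes measured as whole numbers of cells and with n particles in a volume of V cells having V**n configurations; energy dependence is ignored, and the function names are hypothetical. The number of arrangements with n_a particles in the first subvolume is then the binomial coefficient times V_a**n_a times V_b**n_b. Apart from the constant log N!, its logarithm splits into a term n_a log V_a − log n_a! for one subsystem and a corresponding term for the other.

```python
from math import comb, factorial, log


def arrangements(n_a, n_total, v_a, v_b):
    """Number of arrangements with n_a of the n_total labelled particles in subvolume a.

    Volumes are integers (numbers of cells), so the count is exact.
    """
    return comb(n_total, n_a) * v_a ** n_a * v_b ** (n_total - n_a)


def most_probable_split(n_total, v_a, v_b):
    """Value of n_a that maximises the number of arrangements."""
    return max(range(n_total + 1),
               key=lambda n_a: arrangements(n_a, n_total, v_a, v_b))


def subsystem_term(n, v):
    """log(V**n / n!): the contribution of one subsystem, including its 1/n! factor."""
    return n * log(v) - log(factorial(n))


N, V_A, V_B = 40, 1, 3
n_star = most_probable_split(N, V_A, V_B)
print(n_star, N - n_star)   # → 10 30
```

The most probable division puts particles in the two subvolumes in proportion to their volumes (10 and 30 for volumes 1 and 3), and it is the per-subsystem terms, each carrying its own factorial, that determine where this maximum lies.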

Another approach utilizes a reduced state-space on which states related by a permutation of identical particles are identified. In classical physics, this is the space whose elements are what Gibbs called “generic phases,” that is, an unordered list of particle positions and momenta. This reduced phase space is the quotient of ordinary phase space under the group of permutations of identical particles. Consideration of the volume of reduced phase space corresponding to a given state description yields the extensive entropy (38).

In quantum mechanics, the identification of states that differ only by permutation of identical particles is guaranteed by the requirement that the state be symmetric (for bosons) or antisymmetric (for fermions) under interchange of identical particles. Some authors (see, e.g., Huang, 1986, p. 154) claim that the move to a reduced state space cannot be justified in classical terms, and the reasons must be quantum mechanical. Others (e.g., Allis & Herlin, 1952) employ a classical treatment that takes indistinguishability into account. For extended discussions of the rationale for use of reduced state space in classical mechanics, see Saunders (2013, 2018, 2020).

Discussions of this issue tend to presuppose that a move to the reduced state space (or, equivalently, division by N!) is justifiable only if the particles are identical, that is, only if they have the same values of state-independent properties such as mass and charge. On such a view, the non-extensive entropy (36) would be correct for a collection of distinguishable particles, which entails that for such a collection, there is always an entropy of mixing. Swendsen (2006, 2018) has argued that this is problematic when applied to a colloidal suspension. A colloid, such as paint, or milk, consists of colloidal particles suspended in a fluid. The colloidal particles may be large enough that each contains a large number of molecules, and, though their sizes and composition may be sufficiently uniform that the colloid does not vary from place to place in its macroscopically observable properties, it might be that no two of them contain exactly the same number of atoms. A commitment to the position that for a collection of distinct particles there is always an entropy of mixing when a partition is removed entails that the entropy of a can of paint can be lowered or raised merely by inserting or removing a partition.

### 8. Maxwell’s Demon

In his Theory of Heat (1871), Maxwell introduced to the world what he had earlier called in correspondence a “very observant and neat-fingered being,” capable of producing pressure and temperature differences in a gas without expenditure of work. This imaginary being was intended to illustrate a limitation of the second law of thermodynamics. One limitation of the second law, as originally conceived, has already been mentioned. Fluctuations at the molecular level entail that the law as originally formulated cannot hold strictly; deviations from what it dictates that are large enough to be noticeable may be improbable, but they are not impossible. According to Maxwell, there is a further restriction. The scope of even a probabilistic version is limited to situations in which molecules are dealt with in bulk, and not subject to individual manipulations. The second law is a statistical law, of the same sort as found in the statistics of populations, in which certain aggregate quantities, such as the number of traffic accidents per year in a given city, display substantial regularity even though the individual events are unpredictable.

For Maxwell, what the demon achieved was “at present impossible to us” (1871, p. 208), leaving open the possibility that further technological advances might permit the construction of a device that achieved what the demon was meant to do. This has given rise to a vast and bewildering literature on whether there are fundamental physical principles that prevent the construction of such a device.

To address this question, it is, of course, essential to be clear about what is being asked. The thermodynamic state of a system, as has been stressed in this article, is defined in terms of a set of selected variables. Someone possessed of means of manipulating a system on the basis of variables unaccounted for in a definition of entropy may well be able to decrease a system’s entropy, defined in terms of those variables. To see this, consider again the entropy of mixing. If two samples of gas differ with respect to some variable that has not been accounted for in the characterization of the thermodynamic state, a process in which the gases are reversibly expanded by means of membranes permeable to one type of gas but not the other would appear to be one in which heat has been transformed entirely into work, with no compensating increase of entropy, as initial and final states of the gas are being counted as the same thermodynamic state. With a definition of thermodynamic state that respects the distinction between the gases, the process is clearly not a violation of the second law.

A process of that sort ought not to be regarded as cyclic, as it cannot be repeated. In this connection it is helpful to distinguish between two sorts of violations of the second law of thermodynamics. Earman and Norton (1998) distinguished between straight and embellished violations of the second law of thermodynamics. A straight violation decreases the entropy of an adiabatically isolated system, without compensatory increase of entropy elsewhere. An embellished violation exploits such decreases in entropy reliably to provide work. In a similar vein, Wallace (2018b) distinguished between two types of demon. A demon of the first kind decreases some kind of coarse-grained entropy. A demon of the second kind violates the Carnot bound on efficiency of a heat engine over a repeatable cycle that restores the state of the demon plus any auxiliary system utilized to its original thermodynamic state. A demon of the first kind serves to illustrate the dependency of entropy on the class of manipulations considered (or choice of macrovariables for coarse-graining); a demon of the second kind would undermine the foundations of thermodynamics.

The question to be asked, then, is whether a demon of the second kind, a demon that could produce embellished violations, is possible. The answer from statistical mechanics is negative, as long as the dynamical evolution of the demon is Hamiltonian. This follows from equation (20); see also Norton (2013c). That the dynamics be Hamiltonian is a necessary condition, as is illustrated by Earman and Norton (1999), who, building on the work of Skordos (1993) and Zhang and Zhang (1992), exhibited a fictitious system whose dynamics are time-reversal invariant and conserve energy but do not conserve phase-space volume, and which completely converts heat drawn from a heat reservoir into work.

### 9. Thermodynamics of Computation and Landauer’s Principle

There is an extensive and somewhat contentious literature on the thermodynamics of computation, which centers on the issue of whether there are in-principle lower bounds on energy dissipated in a computational process. For a selection of the relevant literature, with an annotated bibliography, see Leff and Rex (2003).

Von Neumann is reported to have estimated a minimum energy dissipation of kT log n per elementary act of computation, for a device operating at temperature T, where n is the number of alternative possibilities for the step taken (von Neumann, 1966, p. 66). More recently, this has been disputed; it is argued that there is no minimum dissipation associated with an act of computation, as long as the computation is logically reversible (see Bennett, 1982). A logically reversible operation is one such that the input logical state can be recovered from the output state; it does not merge computational pathways. Logically irreversible operations are typified by erasure, which sets a memory register to some specified state, independently of its initial state. It is now widely held that there is an unavoidable dissipation associated with logically irreversible operations: an operation that takes each of a set of N input states to the same output state has associated with it an average dissipation of at least kT log N. This assertion has come to be known as Landauer’s principle, after Landauer (1961).
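For a sense of scale, the bound kT log N can be evaluated numerically. The sketch below takes log to be the natural logarithm and uses the SI value of Boltzmann’s constant; the function name and the choice of 300 K as a temperature are illustrative assumptions.

```python
from math import log

BOLTZMANN_K = 1.380649e-23   # J/K, exact by definition in the 2019 SI


def landauer_bound(temperature_kelvin, n_states):
    """Lower bound k T log N, in joules, on the average dissipation of an
    operation that maps each of n_states input states to one output state."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive")
    if n_states < 1:
        raise ValueError("need at least one input state")
    return BOLTZMANN_K * temperature_kelvin * log(n_states)


# Erasure of one bit (two input states merged into one) at 300 K.
print(landauer_bound(300.0, 2))
```

At 300 K the bound for erasing a single bit is roughly 2.9 × 10⁻²¹ J. An operation with a single input state is logically reversible in the trivial sense, and the bound for it is zero.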

Landauer’s principle is sometimes cited as an explanation for why some proposed implementation of a Maxwell demon of the second kind could not work. A demon that carries out a measurement on some system and then performs an operation that depends on the outcome of the measurement must temporarily store the result of the measurement in a memory register. If the demon operates in a cycle, this memory register must be reset in order to complete the cycle. According to Landauer’s principle, this operation carries with it an average entropy increase that offsets any entropy decreases effected during the rest of the cycle. Earman and Norton (1999) raised the question of whether, in a context like this, Landauer’s principle is being invoked as a principle independent of the statistical version of the second law of thermodynamics, in order to save the second law from a threatened violation, or as a consequence of the statistical second law. If the latter, it is of no avail in an argument against a skeptic who seeks to refute the second law by constructing in thought a system that would violate it. It can at best play a heuristic role in analyzing some proposed device that might appear to violate the second law, as a reminder that the device should operate in a cycle and that dissipations associated with resetting its state to the initial state should not be neglected. It is the latter interpretation that is accepted in the literature on the thermodynamics of computation; see, for example, Bennett (2003, p. 501), who suggested that, though Landauer’s principle is in a sense “a straightforward consequence or restatement of the Second Law,” it nevertheless has considerable pedagogic value.

Though it is widely accepted in the literature on the thermodynamics of computation, Landauer’s principle has been the subject of considerable controversy in the philosophical literature (Bennett, 2003; Bub, 2001; Earman & Norton, 1999; Hemmo & Shenker, 2012, 2013, 2019; Ladyman, 2018; Ladyman et al., 2007, 2008; Ladyman & Robertson, 2013, 2014; Maroney, 2005, 2009; Myrvold, 2021b; Norton, 2005, 2011, 2013a, 2013b, 2013c, 2018).

### 10. Conclusion

Thermodynamics has been a subject of philosophical controversy for as long as the subject has existed. This is due, in part, to its peculiar status among the sciences. On the one hand, it is a subject of remarkable generality, embracing physical systems and processes of all sorts, and its use pervades all areas of physics. It thus appears to be foundational for modern physics. Einstein famously said, of thermodynamics, “It is the only physical theory of universal content concerning which I am convinced that, within the framework of applicability of its basic concepts, it will never be overthrown” (Einstein, 1949, p. 33). Nonetheless, it employs concepts, such as a distinction between work and heat, that seem alien to fundamental physics, and considerations such as reversibility show that thermodynamic phenomena are not to be recovered from the fundamental physical laws alone.

This has been a selective overview of some of the philosophical issues arising from thermodynamics. It is meant as a supplement to other overviews; see Further Reading. It has not been an exhaustive overview; some topics of interest have been omitted for reasons of space.

One topic on which there is extensive literature that has been omitted in this article is the issue of the nature of the thermodynamic limit and the role of idealizations in thermodynamics. See Fletcher et al. (2019) for an overview and pointers to the relevant literature; also Fraser (2016), Lavis et al. (2021), Palacios (2019), and Palacios and Valente (2021). Another is black hole thermodynamics. See Wallace (2018a, 2019) for an introduction to the relevant issues, and pointers to the literature.

#### Further Reading

• Callender, C. (2021). Thermodynamic asymmetry in time. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2021 Edition).
• Frigg, R. (2008). A field guide to recent work on the foundations of statistical mechanics. In D. Rickles (Ed.), The Ashgate companion to contemporary philosophy of physics (pp. 99–196). Routledge.
• Maroney, O. (2009). Information processing and thermodynamic entropy. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2009 Edition).
• Sklar, L. (2015). Philosophy of statistical mechanics. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2015 Edition).
• Uffink, J. (2007). Compendium of the foundations of statistical physics. In J. Butterfield & J. Earman (Eds.), Handbook of the philosophy of science: Philosophy of physics (pp. 924–1074). North-Holland.
• Uffink, J. (2022). Boltzmann’s work in statistical physics. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022 Edition).

#### References

• Abraham, E., & Penrose, O. (2017). Physics of negative absolute temperatures. Physical Review E, 95, 012125.
• Abrams, M. (2012). Mechanistic probability. Synthese, 187, 343–375.
• Albert, D. Z. (1994). The foundations of quantum mechanics and the approach to thermodynamic equilibrium. The British Journal for the Philosophy of Science, 45, 669–677.
• Albert, D. Z. (2000). Time and chance. Harvard University Press.
• Albrecht, A., & Phillips, D. (2014). Origin of probabilities and their application to the multiverse. Physical Review D, 90, 123514.
• Albrecht, A., & Sorbo, L. (2004). Can the universe afford inflation? Physical Review D, 70, 063528.
• Allis, W. P., & Herlin, M. A. (1952). Thermodynamics and statistical mechanics. McGraw-Hill.
• Allori, V. (Ed.). (2020a). Statistical mechanics and scientific explanation. World Scientific.
• Allori, V. (2020b). Some reflections on the statistical postulate: Typicality, probability and explanation between deterministic and indeterministic theories. In V. Allori (Ed.), Statistical mechanics and scientific explanation (pp. 65–111). World Scientific.
• Arnold, V. I. (1989). Mathematical methods of classical mechanics (2nd ed.). Springer-Verlag.
• Badino, M. (2020). Reassessing typicality explanations in statistical mechanics. In V. Allori (Ed.), Statistical mechanics and scientific explanation (pp. 147–172). World Scientific.
• Bassi, A., & Ghirardi, G. C. (2003). Dynamical reduction models. Physics Reports, 379, 257–426.
• Beisbart, C., & Hartmann, S. (Eds.). (2011). Probabilities in physics. Oxford University Press.
• Ben-Menahem, Y., & Hemmo, M. (Eds.). (2012). Probability in physics. Springer.
• Bennett, C. H. (1982). The thermodynamics of computation—A review. International Journal of Theoretical Physics, 21, 905–940 (Reprinted in Leff & Rex, 2003, pp. 283–318).
• Bennett, C. H. (2003). Notes on Landauer’s principle, reversible computation, and Maxwell’s demon. Studies in History and Philosophy of Modern Physics, 34, 501–510.
• Berger, A. (2001). Chaos and chance: An introduction to stochastic aspects of dynamics. Walter de Gruyter.
• Berkovitz, J., Frigg, R., & Kronz, F. (2006). The ergodic hierarchy, randomness and Hamiltonian chaos. Studies in History and Philosophy of Modern Physics, 37, 661–691.
• Billingsley, P. (2012). Probability and measure (Anniversary ed.). Wiley.
• Birkhoff, G. D. (1931a). Proof of a recurrence theorem for strongly transitive systems. Proceedings of the National Academy of Sciences, 17, 650–655.
• Birkhoff, G. D. (1931b). Proof of the ergodic theorem. Proceedings of the National Academy of Sciences, 17, 656–660.
• Boltzmann, L. (1871). Einige allgemeine Sätze über Wärmegleichgewichte. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften: Mathematisch-Naturwissenschaftliche Classe, 63, 679–711 (Reprinted in Boltzmann, 1909a, pp. 259–287).
• Boltzmann, L. (1872). Weitere Studien über das Wärmegleichgewicht unter Gasmolekülen. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften: Mathematisch-Naturwissenschaftliche Classe, 66, 275–370 (Reprinted in Boltzmann, 1909b, pp. 316–402; English translation in Boltzmann, 1966a).
• Boltzmann, L. (1877a). Bemerkungen über einige Probleme der mechanischen Wärmetheorie. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften: Mathematisch-Naturwissenschaftliche Classe, 75, 62–100 (Reprinted in Boltzmann, 1909b, pp. 113–148; English translation of Section II in Boltzmann, 1966b).
• Boltzmann, L. (1877b). Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung resp. den Sätzen über das Wärmegleichgewicht. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften: Mathematisch-Naturwissenschaftliche Classe, 76, 373–435 (Reprinted in Boltzmann, 1909b, pp. 164–223; English translation in Sharp & Matschinsky, 2015).
• Boltzmann, L. (1878). Über die Beziehung der Diffusionsphänomene zum zweiten Hauptsatze der mechanischen Wärmetheorie. Die Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften, Wien: Mathematisch-Naturwissenschaften Klasse, 78, 733–763 (Reprinted in Boltzmann, 1909b, pp. 289–317).
• Boltzmann, L. (1884). Über die Eigenschaften monozyklischer und anderer damit verwandter Systeme. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften: Mathematisch-Naturwissenschaftliche Classe, 90, 231–245 (Reprinted in Boltzmann, 1909c, pp. 122–152).
• Boltzmann, L. (1895a). Nochmals das Maxwell’sche Vertheilungsgesetz der Geschwindigkeiten. Annalen der Physik, 55, 223–224 (Reprinted in Boltzmann, 1909c, pp. 532–543).
• Boltzmann, L. (1895b). On certain questions of the theory of gases. Nature, 51, 413–415.
• Boltzmann, L. (1896). Vorlesungen über Gastheorie: I. Theil. Verlag von Johann Ambrosius Barth.
• Boltzmann, L. (1898). Vorlesungen über Gastheorie: II. Theil. Verlag von Johann Ambrosius Barth.
• Boltzmann, L. (1909a). Wissenschaftliche Abhandlungen: I. Band. J. A. Barth.
• Boltzmann, L. (1909b). Wissenschaftliche Abhandlungen: II. Band. J. A. Barth.
• Boltzmann, L. (1909c). Wissenschaftliche Abhandlungen: III. Band. J. A. Barth.
• Boltzmann, L. (1964). Lectures on gas theory. University of California Press (English translation of Boltzmann, 1896, 1898).
• Boltzmann, L. (1966a). Further studies on the thermal equilibrium of gas molecules. In S. G. Brush (Ed.), Kinetic theory: Volume 2. Irreversible processes (pp. 88–175). Pergamon Press (English translation of Boltzmann, 1872).
• Boltzmann, L. (1966b). On the relation of a general mechanical theorem to the second law of thermodynamics. In S. G. Brush (Ed.), Kinetic theory: Volume 2. Irreversible processes (pp. 188–193). Pergamon Press. (English translation of section II of Boltzmann, 1877a)
• Bricmont, J., Dürr, D., Galavotti, M., Ghirardi, G., Petruccione, F., & Zanghì, N. (Eds.). (2001). Chance in physics. Springer.
• Brown, H. R., Myrvold, W., & Uffink, J. (2009). Boltzmann’s H-theorem, its discontents, and the birth of statistical mechanics. Studies in History and Philosophy of Modern Physics, 40, 174–191.
• Brown, H. R., & Uffink, J. (2001). The origins of time-asymmetry in thermodynamics: The minus first law. Studies in History and Philosophy of Modern Physics, 32, 525–538.
• Brush, S. G. (Ed.). (1966). Kinetic theory: Volume 2. Irreversible processes. Pergamon Press.
• Bub, J. (2001). Maxwell’s demon and the thermodynamics of computation. Studies in History and Philosophy of Modern Physics, 32, 569–579.
• Carnot, S. (1824). Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance. Bachelier (Translated in Carnot, 1890 and in Magie, 1899).
• Carnot, S. (1890). Reflections on the motive power of fire, and on machines fitted to develop that power. Macmillan (Translation of Carnot, 1824).
• Carroll, S. M. (2021). Why Boltzmann brains are bad. In S. Dasgupta, R. Dotan, & B. Weslake (Eds.), Current controversies in philosophy of science (pp. 7–20). Routledge.
• Clausius, R. (1854). Ueber eine veränderte Form des zweiten Hauptsatzes der mechanischen Wärmetheorie. Annalen der Physik, 93, 481–506 (Reprinted in Clausius, 1864, pp. 127–154; English translation in Clausius, 1856 and in Clausius, 1867b).
• Clausius, R. (1856). On a modified form of the second fundamental theorem in the mechanical theory of heat. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 12, 81–88 (English translation of Clausius, 1854).
• Clausius, R. (1864). Abhandlungen über die mechanische Wärmetheorie (Vol. 1). Friedrich Vieweg und Sohn.
• Clausius, R. (1865). Ueber verschiedene für die Anwendung bequeme Formen der Hauptgleichungen der mechanischen Wärmetheorie. Annalen der Physik, 125, 353–400 (Reprinted in Clausius, 1867a, pp. 1–44; English translation in Clausius, 1867b).
• Clausius, R. (1867a). Abhandlungen über die mechanische Wärmetheorie (Vol. 2). Friedrich Vieweg und Sohn.
• Clausius, R. (1867b). The mechanical theory of heat, with its applications to the steam engine, and to the physical properties of bodies. John van Voorst (English translation, with one additional paper, of Clausius, 1864).
• Crane, H., & Wilhelm, I. (2020). The logic of typicality. In V. Allori (Ed.), Statistical mechanics and scientific explanation (pp. 173–230). World Scientific.
• Darrigol, O. (2018a). Atoms, mechanics, and probability. Oxford University Press.
• Darrigol, O. (2018b). The Gibbs paradox: Early history and solutions. Entropy, 20, 443.
• Dasgupta, S., Dotan, R., & Weslake, B. (Eds.). (2021). Current controversies in philosophy of science. Routledge.
• Denbigh, K., & Redhead, M. (1983). Gibbs’ paradox and non-uniform convergence. Synthese, 81, 283–312.
• Dieks, D. (2018). The Gibbs paradox and particle individuality. Entropy, 20, 466.
• Duhem, P. (1892). Sur la dissociation dans les systèmes qui renferment un mélange de gaz parfaits. Travaux et mémoires des Facultés de Lille: Tome II. Mémoire, 8.
• Earman, J., & Norton, J. D. (1998). Exorcist XIV: The wrath of Maxwell’s Demon: Part I. From Maxwell to Szilard. Studies in History and Philosophy of Modern Physics, 29, 435–471.
• Earman, J., & Norton, J. D. (1999). Exorcist XIV: The wrath of Maxwell’s Demon: Part II. From Szilard to Landauer and beyond. Studies in History and Philosophy of Modern Physics, 30, 1–40.
• Ehrenfest, P., & Ehrenfest, T. (1912). Begriffliche Grundlagen der statistischen Auffassung in der Mechanik. Teubner. (English translation in Ehrenfest & Ehrenfest, 1959)
• Ehrenfest, P., & Ehrenfest, T. (1959). The conceptual foundations of the statistical approach in mechanics. Cornell University Press (English translation of Ehrenfest & Ehrenfest, 1912).
• Ehrenfest, P., & Trkal, V. (1921). Deduction of the dissociation-equilibrium from the theory of quanta and a calculation of the chemical constant based on this. Koninklijke Akademie van Wetenschappen te Amsterdam: Proceedings of the Section of the Sciences, 23, 162–183.
• Einstein, A. (1949). Autobiographical notes. In P. A. Schilpp (Ed.), Albert Einstein: Philosopher-scientist. Open Court.
• Fletcher, S. C., Palacios, P., Ruetsche, L., & Shech, E. (2019). Infinite idealizations in science: An introduction. Synthese, 196, 1657–1669.
• Fraser, J. D. (2016). Spontaneous symmetry breaking in finite systems. Philosophy of Science, 83, 585–605.
• Frigg, R. (2009). Typicality and the approach to equilibrium in Boltzmannian statistical mechanics. Philosophy of Science, 76, 997–1008.
• Frigg, R. (2011). Why typicality does not explain the approach to equilibrium. In M. Suárez (Ed.), Probabilities, causes and propensities in physics (pp. 77–93). Springer.
• Frigg, R., & Werndl, C. (2012). Demystifying typicality. Philosophy of Science, 79, 917–929.
• Ghirardi, G., & Bassi, A. (2020, Summer). Collapse theories. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University.
• Gibbs, J. W. (1875). On the equilibrium of heterogeneous substances, part I. Transactions of the Connecticut Academy of Arts and Sciences, 3, 108–248. (Reprinted in Gibbs, 1906, pp. 55–184).
• Gibbs, J. W. (1902). Elementary principles in statistical mechanics: Developed with especial reference to the rational foundation of thermodynamics. Charles Scribner’s Sons.
• Gibbs, J. W. (1906). The scientific papers of J. Willard Gibbs, PhD, LLD (Vol. I). Longmans, Green.
• Goldstein, S. (2001). Boltzmann’s approach to statistical mechanics. In J. Bricmont, D. Dürr, M. Galavotti, G. Ghirardi, F. Petruccione, & N. Zanghì (Eds.), Chance in physics (pp. 39–54). Springer.
• Goldstein, S. (2012). Typicality and notions of probability in physics. In Y. Ben-Menahem & M. Hemmo (Eds.), Probability in physics (pp. 59–71). Springer.
• Goold, J., Huber, M., Riera, A., del Rio, L., & Skrzypczyk, P. (2016). The role of quantum information in thermodynamics—A topical review. Journal of Physics A: Mathematical and Theoretical, 49, 143001.
• Gour, G., Müller, M. P., Narasimhachar, V., Spekkens, R. W., & Halpern, N. Y. (2015). The resource theory of informational nonequilibrium in thermodynamics. Physics Reports, 583, 1–58.
• Grad, H. (1961). The many faces of entropy. Communications on Pure and Applied Mathematics, 14, 323–354.
• Hacking, I. (1975). The emergence of probability. Cambridge University Press.
• Hahn, E. L. (1953). Free nuclear induction. Physics Today, 6, 4–9.
• Harman, P. M. (Ed.). (1990). The scientific letters and papers of James Clerk Maxwell: Volume I. 1846–1862. Cambridge University Press.
• Hemmo, M., & Shenker, O. R. (2012). The road to Maxwell’s demon: Conceptual foundations of statistical mechanics. Cambridge University Press.
• Hemmo, M., & Shenker, O. (2013). Entropy and computation: The Landauer–Bennett thesis reexamined. Entropy, 15, 3297–3331.
• Hemmo, M., & Shenker, O. (2019). The physics of implementing logic: Landauer’s principle and the multiple-computations theorem. Studies in History and Philosophy of Modern Physics, 68, 90–105.
• Hilbert, S., Hänggi, P., & Dunkel, J. (2014). Thermodynamic laws in isolated systems. Physical Review E, 90, 062116.
• Huang, K. (1986). Statistical mechanics (2nd ed.). John Wiley & Sons.
• Hubert, M. (2021). Reviving frequentism. Synthese, 199, 5255–5284.
• Jackson, E. A. (1968). Equilibrium statistical mechanics. Dover.
• Jaynes, E. T. (1957a). Information theory and statistical mechanics, I. Physical Review, 106, 620–630 (Reprinted in Jaynes, 1989, pp. 7–16).
• Jaynes, E. T. (1957b). Information theory and statistical mechanics, II. Physical Review, 108, 171–190 (Reprinted in Jaynes, 1989, pp. 19–37).
• Jaynes, E. T. (1965). Gibbs vs Boltzmann entropies. American Journal of Physics, 33, 391–398 (Reprinted in Jaynes, 1989, pp. 77–86).
• Jaynes, E. T. (1971). Violation of Boltzmann’s H theorem in real gases. Physical Review A, 4, 747–750.
• Jaynes, E. T. (1989). Papers on probability, statistics, and statistical physics. Kluwer Academic.
• Khinchin, A. I. (1949). Mathematical foundations of statistical mechanics. Dover.
• Knott, C. G. (1911). Life and scientific work of Peter Guthrie Tait. Cambridge University Press.
• Kotzen, M. (2021). What follows from the possibility of Boltzmann brains? In S. Dasgupta, R. Dotan, & B. Weslake (Eds.), Current controversies in philosophy of science (pp. 21–34). Routledge.
• La Caze, A. (2016). Frequentism. In A. Hájek & C. Hitchcock (Eds.), The Oxford handbook of probability and philosophy (pp. 341–359). Oxford University Press.
• Ladyman, J. (2018). Intension in the physics of computation: Lessons from the debate about Landauer’s principle. In M. E. Cuffaro & S. C. Fletcher (Eds.), Physical perspectives on computation, computational perspectives in physics (pp. 219–239). Cambridge University Press.
• Ladyman, J., Presnell, S., & Short, A. J. (2008). The use of the information-theoretic entropy in thermodynamics. Studies in History and Philosophy of Modern Physics, 39, 315–324.
• Ladyman, J., Presnell, S., Short, A. J., & Groisman, B. (2007). The connection between logical and thermodynamic irreversibility. Studies in History and Philosophy of Modern Physics, 38, 58–79.
• Ladyman, J., & Robertson, K. (2013). Landauer defended: Reply to Norton. Studies in History and Philosophy of Modern Physics, 44, 263–271.
• Ladyman, J., & Robertson, K. (2014). Going round in circles: Landauer vs. Norton on the thermodynamics of computation. Entropy, 16, 2278–2290.
• Landauer, R. (1961). Irreversibility and heat generation in the computing process. IBM Journal of Research and Development, 5, 183–191. (Reprinted in Leff & Rex, 2003, pp. 148–156)
• Laplace, P. S. (1814). Essai philosophique sur les probabilités. Courcier (English translation in Laplace, 1902).
• Laplace, P. S. (1902). A philosophical essay on probabilities. John Wiley & Sons (Translation of Laplace, 1814).
• Lavis, D. A. (2019). The question of negative temperatures in thermodynamics and statistical mechanics. Studies in History and Philosophy of Modern Physics, 67, 26–63.
• Lavis, D. A., Kühn, R., & Frigg, R. (2021). Becoming large, becoming infinite: The anatomy of thermal physics and phase transitions in finite systems. Foundations of Physics, 51, 90.
• Lebowitz, J. L. (1993). Boltzmann’s entropy and time’s arrow. Physics Today, 46, 33–38.
• Lebowitz, J. L. (1999a). Microscopic origins of irreversible macroscopic behavior. Physica A, 263, 516–527.
• Lebowitz, J. L. (1999b). Statistical mechanics: A selective review of two central issues. Reviews of Modern Physics, 71, 346–357.
• Leff, H. S., & Rex, A. F. (Eds.). (2003). Maxwell’s demon 2: Entropy, classical and quantum information, computing. Institute of Physics.
• Loschmidt, J. (1876). Über den Zustand des Wärmegleichgewichtes eines Systemes von Körpern mit Rücksicht auf die Schwerkraft. Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften zu Wien, mathematisch-naturwissenschaftliche Klasse, 73, 128–142.
• Lostaglio, M. (2019). An introductory review of the resource theory approach to thermodynamics. Reports on Progress in Physics, 82, 114001.
• Magie, W. F. (Ed.). (1899). The second law of thermodynamics: Memoirs by Carnot, Clausius, and Thomson. Harper & Brothers.
• Malament, D. B., & Zabell, S. L. (1980). Why Gibbs phase averages work—The role of ergodic theory. Philosophy of Science, 47, 339–349.
• Maroney, O. (2005). The (absence of a) relationship between thermodynamic and logical reversibility. Studies in History and Philosophy of Modern Physics, 36, 355–374.
• Maroney, O. (2009). Generalizing Landauer’s principle. Physical Review E, 79, 031105.
• Maudlin, T. (2020). The grammar of typicality. In V. Allori (Ed.), Statistical mechanics and scientific explanation (pp. 231–251). World Scientific.
• Maxwell, J. C. (1871). Theory of heat. Longmans, Green.
• Maxwell, J. C. (1873). Molecules. Nature, 8, 437–441 (Reprinted in Niven, 1890, pp. 361–377).
• Maxwell, J. C. (1878). Tait’s “Thermodynamics,” II. Nature, 17, 278–280 (Reprinted in Niven, 1890, pp. 665–671).
• Maxwell, J. C. (1879). On Boltzmann’s theorem of the average distribution of energy in a system of material points. Transactions of the Cambridge Philosophical Society, 12, 547–570. (Reprinted in Niven, 1890, pp. 713–741).
• Myrvold, W. C. (2011). Statistical mechanics and thermodynamics: A Maxwellian view. Studies in History and Philosophy of Modern Physics, 42, 237–243.
• Myrvold, W. C. (2012). Deterministic laws and epistemic chances. In Y. Ben-Menahem & M. Hemmo (Eds.), Probability in physics (pp. 73–85). Springer.
• Myrvold, W. C. (2020). The science of $\Theta\Delta^{\text{cs}}$. Foundations of Physics, 50, 1219–1251.
• Myrvold, W. C. (2021a). Beyond chance and credence. Oxford University Press.
• Myrvold, W. C. (2021b). Shakin’ all over: Proving Landauer’s principle without neglect of fluctuations. The British Journal for the Philosophy of Science.
• Neumann, C. (1891). Bemerkungen zur mechanischen Theorie der Wärme. Berichte über die Verhandlungen der Königlich Sächsischen Gesellschaft der Wissenschaften zu Leipzig, 43, 75–156.
• Ng, N. H. Y., & Woods, M. P. (2018). Resource theory of quantum thermodynamics: Thermal operations and second laws. In F. Binder, L. A. Correa, C. Gogolin, J. Anders, & G. Adesso (Eds.), Thermodynamics in the quantum regime: Fundamental aspects and new directions (pp. 625–650). Springer.
• Niven, W. D. (Ed.). (1890). The scientific papers of James Clerk Maxwell (Vol. 2). Cambridge University Press.
• Norton, J. D. (2005). Eaters of the lotus: Landauer’s principle and the return of Maxwell’s demon. Studies in History and Philosophy of Modern Physics, 36, 375–411.
• Norton, J. D. (2011). Waiting for Landauer. Studies in History and Philosophy of Modern Physics, 42, 184–198.
• Norton, J. D. (2013a). Author’s reply to Landauer defended. Studies in History and Philosophy of Modern Physics, 44, 272.
• Norton, J. D. (2013b). The end of the thermodynamics of computation: A no-go result. Philosophy of Science, 80, 1182–1192.
• Norton, J. D. (2013c). All shook up: Fluctuations, Maxwell’s demon and the thermodynamics of computation. Entropy, 15, 4432–4483.
• Norton, J. D. (2018). Maxwell’s demon does not compute. In M. E. Cuffaro & S. C. Fletcher (Eds.), Physical perspectives on computation, computational perspectives in physics (pp. 240–256). Cambridge University Press.
• Palacios, P. (2019). Phase transitions: A challenge for intertheoretic reduction? Philosophy of Science, 86, 612–640.
• Palacios, P., & Valente, G. (2021). The paradox of infinite limits: A realist response. In T. D. Lyons & P. Vickers (Eds.), Contemporary scientific realism: The challenge from the history of science (pp. 312–349). Oxford University Press.
• Penrose, O. (2001). The direction of time. In J. Bricmont, D. Dürr, M. Galavotti, G. Ghirardi, F. Petruccione, & N. Zanghì (Eds.), Chance in physics (pp. 61–82). Springer.
• Penrose, O., & Percival, I. C. (1962). The direction of time. Proceedings of the Physical Society, 79, 605–616.
• Pitowsky, I. (2012). Typicality and the role of the Lebesgue measure in statistical mechanics. In Y. Ben-Menahem & M. Hemmo (Eds.), Probability in physics (pp. 41–58). Springer.
• Plancherel, M. (1913). Beweis der Unmöglichkeit ergodischer mechanischer Systeme. Annalen der Physik, 42, 1061–1063.
• Planck, M. (1897). Vorlesungen über Thermodynamik. Verlag von Veit.
• Planck, M. (1901). Ueber das Gesetz der Energieverteilung im Normalspektrum. Annalen der Physik, 4, 553–563.
• Planck, M. (1910). Acht Vorlesungen über Theoretische Physik gehalten an der Columbia University in the City of New York im Frühjahr 1909. S. Hirzel (English translation in Planck, 1915).
• Planck, M. (1913). Vorlesungen über die Theorie der Wärmestrahlung (2nd ed.). Verlag von Johann Ambrosius Barth (English translation in Planck, 1914).
• Planck, M. (1914). The theory of heat radiation. P. Blakiston’s Son (English translation, by Morton Masius, of Planck, 1913).
• Planck, M. (1915). Eight lectures on theoretical physics delivered at Columbia University in 1909. Columbia University Press (Translation, by A. P. Wills, of Planck, 1910).
• Planck, M. (1922). Vorlesungen über Thermodynamik (7th ed.). Walter de Gruyter.
• Poincaré, H. (1890). Sur le problème des trois corps et les équations de la dynamique. Acta Mathematica, 13, 8–270.
• Poincaré, H. (1893). Le mécanisme et l’expérience. Revue de Métaphysique et de Morale, 1, 534–537.
• Poincaré, H. (1902). La science et l’hypothèse. Flammarion. (English translation in Poincaré, 2018)
• Poincaré, H. (2018). Science and hypothesis. Bloomsbury Academic (Translation, by M. Frappier, A. Smith, & D. J. Stump, of Poincaré, 1902).
• Price, H. (2002). Boltzmann’s time bomb. The British Journal for the Philosophy of Science, 53, 83–119.
• Rosenthal, A. (1913). Beweis der Unmöglichkeit ergodischer Gassysteme. Annalen der Physik, 42, 796–806.
• Rosenthal, J. (2010). The natural-range conception of probability. In G. Ernst & A. Hüttemann (Eds.), Time, chance and reduction: Philosophical aspects of statistical mechanics (pp. 71–91). Cambridge University Press.
• Rosenthal, J. (2012). Probabilities as ratios of ranges in initial-state spaces. Journal of Logic, Language, and Information, 21, 217–236.
• Saunders, S. (2013). Indistinguishability. In R. Batterman (Ed.), The Oxford handbook of philosophy of physics (pp. 340–380). Oxford University Press.
• Saunders, S. (2018). The Gibbs paradox. Entropy, 20, 552.
• Saunders, S. (2020). The concept “indistinguishable.” Studies in History and Philosophy of Modern Physics.
• Savage, L. J. (1973). Probability in science: A personalistic account. In P. Suppes (Ed.), Logic, methodology and philosophy of science IV (pp. 417–428). North-Holland.
• Sharp, K., & Matschinsky, F. (2015). Translation of Ludwig Boltzmann’s paper “On the relationship between the second fundamental theorem of the mechanical theory of heat and probability calculations regarding the conditions for thermal equilibrium.” Entropy, 17, 1971–2009 (English translation of Boltzmann, 1877b).
• Sklar, L. (1993). Physics and chance. Cambridge University Press.
• Skordos, P. (1993). Compressible dynamics, time reversibility, Maxwell’s demon, and the second law. Physical Review E, 48, 777–784.
• Strevens, M. (2003). Bigger than chaos: Understanding complexity through probability. Harvard University Press.
• Strevens, M. (2011). Probability out of determinism. In C. Beisbart & S. Hartmann (Eds.), Probabilities in physics (pp. 339–364). Oxford University Press.
• Swendsen, R. H. (2006). Statistical mechanics of colloids and Boltzmann’s definition of the entropy. American Journal of Physics, 74, 187–190.
• Swendsen, R. H. (2018). Probability, entropy, and Gibbs’ paradox(es). Entropy, 20, 450.
• Thomson, W. (1874). The kinetic theory of the dissipation of energy. Nature, 9, 441–444 (Reprinted in Thomson, 1911, pp. 11–20).
• Thomson, W. (1911). Mathematical and physical papers (Vol. V). Cambridge University Press.
• Tolman, R. C. (1938). The principles of statistical mechanics. Clarendon Press.
• Uffink, J. (2011). Subjective probability and statistical physics. In C. Beisbart & S. Hartmann (Eds.), Probabilities in physics (pp. 25–49). Oxford University Press.
• Uhlenbeck, G. E., & Ford, G. W. (1963). Lectures in statistical mechanics. American Mathematical Society.
• van Kampen, N. G. (1984). The Gibbs paradox. In W. E. Parry (Ed.), Essays in theoretical physics in honour of Dirk ter Haar (pp. 303–312). Pergamon Press.
• van Lith, J. (2018). The Gibbs paradox: Lessons from thermodynamics. Entropy, 20, 328.
• Venn, J. (1866). The logic of chance. Macmillan.
• Versteegh, M. A. M., & Dieks, D. (2011). The Gibbs paradox and the distinguishability of identical particles. American Journal of Physics, 79, 741–746.
• von Mises, R. (1928). Wahrscheinlichkeit, Statistik und Wahrheit. Springer Verlag.
• von Neumann, J. (1966). Theory of self-reproducing automata (A. W. Burks, Ed. & Comp.). University of Illinois Press.
• Wallace, D. (2018a). The case for black hole thermodynamics part I: Phenomenological thermodynamics. Studies in History and Philosophy of Modern Physics, 64, 52–67.
• Wallace, D. (2018b, June 21). Thermodynamics as control theory [Lecture delivered at conference]. Thermodynamics as a resource theory: Philosophical and foundational implications, University of Western Ontario.
• Wallace, D. (2019). The case for black hole thermodynamics part II: Statistical mechanics. Studies in History and Philosophy of Modern Physics, 66, 103–117.
• Werndl, C. (2013). Justifying typicality measures of Boltzmannian statistical mechanics and dynamical systems. Studies in History and Philosophy of Modern Physics, 44, 470–479.
• Werndl, C., & Frigg, R. (2020). When do Gibbsian phase averages and Boltzmannian equilibrium values agree? Studies in History and Philosophy of Modern Physics, 72, 46–69.
• Wiedeburg, O. (1894). Das Gibbs’sche Paradoxon. Annalen der Physik, 289, 684–697.
• Zermelo, E. (1896). Ueber einen Satz der Dynamik und die mechanische Wärmetheorie. Annalen der Physik, Neue Folge, 57, 485–494 (Reprinted, with English translation, in Zermelo, 2013, pp. 214–228).
• Zermelo, E. (2013). Collected works/Gesammelte Werke (Vol. II). Springer.
• Zhang, K., & Zhang, K. (1992). Mechanical models of Maxwell’s demon with noninvariant phase volume. Physical Review A, 46, 4598–4605.