Entropy and knowledge

This entry tries to dispel some widespread myths related to entropy and its interpretation in the context of statistical physics and thermodynamics.

An idea that can be found in the literature is that entropy is a property of a statistical ensemble of (many identically prepared) systems, and cannot be associated with a single system. It originates from a particular interpretation of statistical mechanics, an interpretation full of foundational difficulties. (See

for a discussion of these foundations from a perspective of classical mechanics. The quantum version is even more muddled, in view of the additional problems caused by the interpretation of quantum mechanics.)

But thermodynamics does not depend on the concept of an ensemble. It has a clear and very successful concept of entropy, applicable (and routinely applied) to single systems.

As every chemical engineer knows, each _single_ glass of water (in canonical contact with its surroundings) has an entropy easily computable from temperature, volume and pressure and the free energy expression for water, according to well-known formulas.
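
To make the recipe concrete, here is a minimal sketch (mine, not taken from any engineering handbook): given some Helmholtz free energy expression F(T,V) for a fixed amount of water, the entropy follows from the standard relation S = -(dF/dT) at constant V, approximated below by a finite difference. The function F_toy is purely hypothetical, only there to make the sketch runnable; a real calculation would use a fitted equation of state for water.

    # Minimal sketch (not a validated water model): given a Helmholtz free
    # energy F(T, V) for a fixed amount of substance, the entropy follows
    # from the standard relation S = -(dF/dT) at constant V.
    def entropy_from_free_energy(F, T, V, dT=1e-3):
        """Entropy via a central finite difference in temperature."""
        return -(F(T + dT, V) - F(T - dT, V)) / (2 * dT)

    # Hypothetical toy free energy, merely to make the sketch runnable.
    def F_toy(T, V):
        return -1000.0 * T * (1.0 + 0.001 * V)   # purely illustrative

    print(entropy_from_free_energy(F_toy, T=298.15, V=2.5e-4))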

Any interpretation of quantum mechanics that is adequate not only for microsystems but also for macrosystems must account for the fact that thermodynamics is routinely and successfully applied to _single_ systems. (See also the entry ''Does quantum mechanics apply to single systems?'' elsewhere in this FAQ.)

As everywhere in statistics, and independently of whether a system is described by classical mechanics or by quantum mechanics, quantities with small fluctuations are determined to the corresponding accuracy in _every_ single realization, with very high probability, high enough that larger deviations are at least as likely attributable to modeling effects -- such as simplifying assumptions or measurement errors -- as to statistical fluctuations computed from the model.
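
As a numerical illustration of this point (my own, not part of the original argument): for an extensive quantity built from N independent microscopic contributions, the relative fluctuation across repeated realizations shrinks like 1/sqrt(N), so for macroscopic N a single realization already pins the value down far below any experimental accuracy.

    # The relative fluctuation of a sum of N independent contributions
    # shrinks like 1/sqrt(N); a single macroscopic realization therefore
    # already determines the quantity to very high relative accuracy.
    import numpy as np

    rng = np.random.default_rng(0)
    for N in (10**2, 10**4, 10**6):
        totals = np.array([rng.random(N).sum() for _ in range(20)])
        print(N, totals.std() / totals.mean())   # roughly 0.6 / sqrt(N)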

This applies equally to the length of a table, the weight of a coin, or the entropy of a glass of water.

Ensembles are needed only to account for the distribution of quantities whose uncertainty is comparable in size to the experimental accuracy of individual measurements.


Sometimes, it is claimed (even by otherwise very competent people) that entropy is not a property of a system, but a property of our knowledge of a system (or, equivalently, of the 'ensemble' within which it is viewed as lying).

But this is not true. For example, according to standard engineering practice, derived rigorously from statistical mechanics, the entropy of water at equilibrium has a value computable from a few state variables characterizing the equilibrium state, independent of whether the ensemble considered is microcanonical, canonical, or grand canonical. (The thermodynamic limit erases these differences.)
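
A toy check of this ensemble independence (my own construction, using assumed independent two-level systems rather than water): the microcanonical entropy per subsystem at fixed excitation number approaches the canonical (binary) entropy per subsystem at the same mean excitation, the difference vanishing like log(N)/N as the thermodynamic limit is approached.

    # Toy check of ensemble equivalence: N independent two-level systems.
    # Microcanonical entropy per subsystem at fixed excitation number n,
    # versus the canonical (binary) entropy at the same mean excitation.
    from math import lgamma, log

    def s_microcanonical(N, n):            # (1/N) * log binomial(N, n)
        return (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)) / N

    def s_canonical(p):                    # binary entropy at excitation prob p
        return -(p * log(p) + (1 - p) * log(1 - p))

    p = 0.3
    for N in (10, 100, 10_000, 1_000_000):
        n = int(p * N)
        print(N, s_canonical(n / N) - s_microcanonical(N, n))   # -> 0 like log(N)/N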

Nowhere does ''our'' (whose?) knowledge figure as input. Thus the entropy is knowledge-independent.

Knowledge is needed only to decide whether, to an accuracy deemed adequate, a given system is or is not in an equilibrium state. Thus the role of knowledge is only to determine whether a description is applicable to the system under consideration.


In this situation, defenders of the maximum entropy principle argue that energy and particle number were chosen as macrovariables because they correspond to the known experimental constraints. If one had different information about the water concerned, one would have a different ensemble and a different entropy.

This is of course true for purely logical reasons, because the hypothesis can never be satisfied. One _cannot_ obtain information about a system in equilibrium different from that contained in the few numbers needed to determine its thermodynamic state.

One can, however, obtain _less_ information. But if one uses this reduced information (forgetting to account for some conserved quantity, say) together with the maximum entropy principle, one gets a completely different equation of state, and hence completely _wrong_ results.
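
The following toy calculation (mine; the energy levels and the conserved quantity Q are invented purely for illustration) shows the mechanism: if the ''true'' equilibrium state is constrained by both the mean energy and a conserved quantity Q, but one applies the maximum entropy principle with the energy constraint alone, one obtains a different distribution, a different entropy, and hence different predictions.

    # Toy illustration: maximum entropy with a forgotten conserved quantity.
    # "True" state: p ~ exp(-beta*E - gamma*Q) over a few invented levels.
    # Matching only <E> yields a different distribution and entropy.
    import numpy as np

    E = np.array([0.0, 1.0, 1.0, 2.0, 3.0])       # toy energy levels
    Q = np.array([0.0, 1.0, -1.0, 2.0, 0.0])      # toy conserved quantity

    def gibbs(*pairs):                            # p ~ exp(-sum lambda_i * A_i)
        w = np.exp(-sum(lam * A for lam, A in pairs))
        return w / w.sum()

    def entropy(p):
        return float(-(p * np.log(p)).sum())

    p_true = gibbs((1.0, E), (0.8, Q))            # constraints on <E> and <Q>
    E_mean = float(p_true @ E)

    # Bisection for the inverse temperature reproducing <E> alone.
    lo, hi = 1e-6, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if gibbs((mid, E)) @ E > E_mean else (lo, mid)
    p_naive = gibbs((mid, E))

    print("entropy with both constraints:", entropy(p_true))
    print("entropy with <E> only        :", entropy(p_naive))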


On the other hand, one can question whether the system is in equilibrium, and make a different assumption about the system. Then one has a model with more degrees of freedom, and can try to determine its nonequilibrium state. If the system was in equilibrium to high accuracy, this will only reproduce the previous result, with tiny corrections.

Or one may question whether the system is described well by the usual equation of state for water, and consider a refined model in which H_2O, HDO, and D_2O are distinguished. For ordinary water, one will still get essentially the same results. But if the water had been specially prepared to contain uncommon amounts of deuterium, one can find measurable deviations, and may get a different value for the entropy. But this is proof that the simpler model was inadequate, and hence that the entropy calculation based on it was misleading.


One of the authorities often quoted in this context is Jaynes, with his discussion of the Gibbs paradox, available, e.g., at http://www.physics.rutgers.edu/~wdwu/351/gibbs.paradox.pdf (I did not find it convincing).

His analysis is based on the very questionable assumption that a model that clearly does not match experimental results should nevertheless be considered adequate for the computation of the entropy.

Nowhere else in physics would such an assumption be tolerated. If one calculates the energy of a damped harmonic oscillator from a simplified model of an undamped harmonic oscillator, one should not be surprised if the computed and the actual energies differ.
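
A small numerical illustration of this point (my own, with arbitrarily chosen parameters): integrating a weakly damped oscillator and comparing its energy after some time with the conserved energy predicted by the undamped idealization started from the same initial data.

    # Energy predicted by an undamped-oscillator model versus the energy of
    # an actually damped oscillator with the same initial conditions.
    m, k, c = 1.0, 1.0, 0.1          # mass, spring constant, damping coefficient
    x, v = 1.0, 0.0                  # initial displacement and velocity
    dt, T = 1e-3, 30.0

    for _ in range(int(T / dt)):     # semi-implicit Euler for m x'' = -k x - c x'
        a = (-k * x - c * v) / m
        v = v + a * dt
        x = x + v * dt

    E_damped = 0.5 * m * v**2 + 0.5 * k * x**2
    E_undamped_model = 0.5 * k * 1.0**2       # conserved energy of the idealized model
    print(E_damped, E_undamped_model)         # they disagree substantially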

It would be very strange to attribute this difference in energy to a difference in the knowledge of the observer. But it can be attributed to a lack of expertise or lack of care in the model selected to describe the system.


To be able to keep the discussion focussed, we need to be clear about what we mean when talking about a physical system. A physical system is characterized by its kinematics (choice of observable algebra) and its dynamics (choice of equation of motion).

This is the only conceptually clear way of saying what a physical system is, from the theoretical point of view. From the practical point of view, one must judge whether a particular real-life situation is sufficiently well modelled by the system description chosen.

It seems to me that this adequately describes actual practice, and allows a rigorous discussion of the matter.

Of course, one also applies the term ''physical system'' to particular real-life situations before one has adequate models. But it is impossible to discuss these physical systems precisely with the tools of physics without making corresponding modeling assumptions, which turn the system into a physical system of the kind I defined.

Without an adequate model, concepts such as entropy (or any other concept from theoretical physics) do not make sense at all. But two models that are both adequate will assign the same entropy, as I demonstrated in the discussion of the Gibbs paradox in Chapter 7 of my book

One can find there (among many other things) detailed discussions
  • of the Gibbs paradox (which is the best-known experimental situation that allows one to probe the basic properties of entropy),
  • of the maximum entropy principle (which is often invoked mistakenly to justify the view that entropy depends on our knowledge of a system), and
  • of the inadequacy of a subjective probability approach to physics.


    Frequently, positive entropy is associated with the idea of an incomplete description of a physical system. But all our descriptions are incomplete, so this is an empty idea.

    No physicist studying a pendulum is considering it as a system of 10^{25} atoms. The system is _defined_ by a point mass, a curve on which it is moving, and (depending on the detail of modeling) an assumption about friction.

    Everybody knows that this is an idealization of the description of a real pendulum, which has further degrees of freedom, such as its color, its shape, and the medium in which it is moving.

    But something similar is the case for every physical system. It is _always_ an incomplete description of reality. If we insist on a complete description, we cannot do physics.

    Every physical system that has been considered in the past and that will be considered in the future is defined by telling which observables are supposed to be accessible; everything else is idealized away. Nevertheless, one considers the description to be complete within the approximations made. Otherwise we could not do physics without reducing everything to the still unknown unified theory. And who knows whether this description, when found, is not missing further submicroscopic degrees of freedom....


    The intuition behind the argument based on incompleteness is that the entropy of a physical system is positive once the system is part of a bigger system with which it is interacting. By tracing out the environment one throws away information about it, and -- equating entropy with a measure of lack of information -- one concludes that the entropy should have increased.

    But in this case, we have two different systems: the small system A and the bigger system B containing the stuff with which A is interacting. These two systems have most likely different entropies S_A and S_B. There is no general relation between these entropies. In general, it is physically possible to have S_A greater than, smaller than, or equal to S_B.

    Indeed, a smaller system does not need to have a bigger entropy: some of the smallest systems (e.g., a photon in a fully polarized monochromatic beam) have zero entropy, although they incorporate very little information (only one qubit). Thus the argument has no weight.
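
    A concrete two-qubit check of this (my own sketch, using numpy): a maximally entangled pure state of the big system has zero entropy, while each one-qubit subsystem has entropy log 2, so a part may well have a larger entropy than the whole.

        # Two qubits in the entangled pure state (|00> + |11>)/sqrt(2):
        # the big system has entropy 0, each subsystem has entropy log 2.
        import numpy as np

        def vn_entropy(rho):
            """von Neumann entropy -tr(rho log rho), ignoring zero eigenvalues."""
            p = np.linalg.eigvalsh(rho)
            p = p[p > 1e-12]
            return float(-(p * np.log(p)).sum())

        psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)         # |00> + |11>, normalized
        rho_B = np.outer(psi, psi)                                 # big system: pure state
        rho_A = np.einsum('ikjk->ij', rho_B.reshape(2, 2, 2, 2))   # trace out the 2nd qubit

        print(vn_entropy(rho_B))   # 0.0
        print(vn_entropy(rho_A))   # log(2) ~ 0.693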


    From another perspective, it is instructive to realize that the trace over the environment does not lose any information regarding the algebra of observables corresponding to the system A. Only information not available in the system A is traced over.

    By standard methods of nonequilibrium dynamics one can write an exact dynamical equation (with memory) for the state rho_A(t) of A at time t in terms of rho_A(s) for s smaller than t, without reference to B (though computing this dynamics of course requires knowledge of the dynamics of B and its state).

    Of course, one loses information about the bigger system (for example, how the small system is embedded in it). So the bigger system might have a lower entropy. But the bigger system contains many more observables, which may themselves contribute to the entropy of the bigger system. Therefore, in general, the entropy of the bigger system B is unrelated to the entropy of the smaller system A.


    In general, tracing out an environment relates two _different_ systems, namely the small system (without environment) and the big system (with environment).

    The relation is as follows. Let S and B denote the algebras of observables defined on the small and the big system, respectively. Then there are two states rho_S (in the small system) and rho_B (in the big system) such that the expectation of any observable A from S is given in two _fully equivalent_ ways, either as

  • <A> = tr_S rho_S A, (1)
    or as
  • <A> = tr_B rho_B A. (2)
    For classical systems, the same holds provided that one replaces the traces by phase space integrals and the density matrices by phase space density functions.

    Typically, the small system is defined in such a way that rho_S is obtainable as a partial trace from rho_B. But since both (1) and (2) are exact identities, no information about the small system is lost when going from the big system to the small system. (What is lost is information about how the small system relates to the big system, but this information is considered irrelevant in applications where only the small system is considered. The big system is only taken into account by means of appropriate boundary conditions.)
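
    A small numerical check of this equivalence (my own sketch, for a two-qubit big system B and a one-qubit small system S): the reduced state obtained by the partial trace reproduces the expectation of every observable A of the small system, i.e. (1) and (2) give the same number.

        # Check of the identity (1) = (2): expectations of observables of the
        # small system agree whether computed with rho_S or with rho_B.
        import numpy as np

        rng = np.random.default_rng(1)

        # Random two-qubit density matrix rho_B (positive, trace one).
        M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
        rho_B = M @ M.conj().T
        rho_B /= np.trace(rho_B)

        # Reduced state of the first qubit, and a random observable A on it.
        rho_S = np.einsum('ikjk->ij', rho_B.reshape(2, 2, 2, 2))
        A = rng.normal(size=(2, 2))
        A = A + A.T                                    # make it Hermitian

        lhs = np.trace(rho_S @ A)                      # <A> = tr_S rho_S A   (1)
        rhs = np.trace(rho_B @ np.kron(A, np.eye(2)))  # <A> = tr_B rho_B A   (2)
        print(np.allclose(lhs, rhs))                   # True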

    But of course, there are two entropies: The entropy

  • S_B = - tr_B rho_B log rho_B
    of the big system, and the entropy
  • S_S = - tr_S rho_S log rho_S
    of the small system. These are different (and S_B may be smaller or larger than S_S), but this is no more remarkable than that the volume of a large system is larger than the volume of a small system.
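
    Continuing the same sketch, the two entropies can indeed come out in either order: for the maximally mixed two-qubit state, S_B = 2 log 2 exceeds S_S = log 2, whereas the entangled pure state considered earlier gives S_B = 0 below S_S = log 2.

        # Opposite ordering to the entangled example: for the maximally mixed
        # two-qubit state, S_B = 2 log 2 while the reduced state has S_S = log 2.
        import numpy as np

        def vn_entropy(rho):
            p = np.linalg.eigvalsh(rho)
            p = p[p > 1e-12]
            return float(-(p * np.log(p)).sum())

        rho_B = np.eye(4) / 4                                       # maximally mixed big system
        rho_S = np.einsum('ikjk->ij', rho_B.reshape(2, 2, 2, 2))    # reduced state = I/2

        print(vn_entropy(rho_B))   # 2 log 2 ~ 1.386
        print(vn_entropy(rho_S))   # log 2   ~ 0.693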

    In particular, all this is completely independent of any knowledge anyone might have about the system.


    One may think that a coarse description should be regarded as an approximation only, and that the finest possible description has entropy zero. But how can one tell? It cannot be decided by experiment.

    The finest possible description must account for every detail in the whole universe, defined as the smallest system containing the earth and not interacting with anything outside it. There is only a _single_ such system, and we know almost nothing about it (except for some information concentrated on a neighborhood of a particular family of past light cones).

    In particular, it is impossible to determine by experiments whether this universal system is in a pure or in a mixed state.

    The entropy of this system is zero only if we assume that this particular system is in a pure state. But since this can never be checked by experiment, it is a questionable assumption.

    No physicist ever considered this system in any detail. Cosmologists study only extremely coarse approximations to this system.

    All systems we know of experimentally -- and all systems ever considered quantitatively by physicists -- are vastly smaller than this unique universal system. They all have positive entropy, apart from very tiny systems with very few degrees of freedom.

    Thus the entropy of most systems is positive. The case of zero entropy is very, very singular, though at times it is a convenient idealization.

    The finest possible description therefore _only_ applies to the single universal system containing all information about the whole universe, and even then the argument about its zero entropy proves nothing, but is based on an unverifiable assumption.


    The sometimes held opinion that, since a coarse description incorporates less information, its entropy must be higher, is therefore an illusion.

    There is no connection between the entropy of a system and the amount of information known about it by a particular ''knower'' (whoever this should be). Entropies are incomparable once they are about different systems. A smaller system does not need to have a bigger entropy. A coarser description is the description of a _different_ system. Otherwise all descriptions would have to be descriptions of the whole universe.


    Arnold Neumaier (Arnold.Neumaier@univie.ac.at)
    A theoretical physics FAQ