# Entropy and missing information

How is this notion of information related to information in terms of entropy?

Informally, entropy is often equated with information, but this is not correct - entropy is _missing_ information!

More precisely, in the statistical interpretation, the state belongs not to a single particle but to an ensemble of particles. Entropy measures the amount of information missing for a complete probabilistic description of a system.

Entropy is the mean number of binary questions that must be asked in an optimal decision strategy to determine the state of a particular realization given the state of the ensemble to which it belongs. See Appendix A of my paper

A. Neumaier, On the foundations of thermodynamics, arXiv:0705.3790 http://lanl.arxiv.org/abs/0705.3790
The formula for the entropy S found in every statistical mechanics textbook is, for a system in a mixed state described by the density matrix rho,

S = -&lt;kbar log rho&gt;, where &lt;f&gt; = Tr rho f

and kbar is Boltzmann's constant. (I use the bar to be free to use k as an index.) In any representation where rho is diagonal, rho = sum_k p_k |k&gt;&lt;k|, this gives

S = -kbar sum_k p_k log p_k.

Since &lt;1&gt; = 1 and rho is positive semidefinite, sum_k p_k = 1 and all p_k &gt;= 0. Thus p_k can be consistently interpreted as the probability of the system to occupy the state |k&gt;. This probability interpretation depends on the orthonormal basis used to represent rho; which basis to use is a famous and not really solved problem in the foundations of quantum mechanics.
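As a numerical sketch of this formula (with kbar set to 1 for simplicity, and a hypothetical two-level density matrix as input), the entropy can be computed from the eigenvalues p_k of rho:

```python
import numpy as np

def von_neumann_entropy(rho, kbar=1.0):
    """S = -kbar Tr(rho log rho), evaluated via the eigenvalues p_k of rho."""
    p = np.linalg.eigvalsh(rho)      # eigenvalues of the Hermitian matrix rho
    p = p[p > 1e-12]                 # drop (numerically) zero p_k: p log p -> 0 as p -> 0
    return -kbar * np.sum(p * np.log(p))

# Maximally mixed 2-state system: rho = diag(1/2, 1/2) gives S = kbar log 2.
rho_mixed = np.diag([0.5, 0.5])
print(von_neumann_entropy(rho_mixed))   # log 2 ≈ 0.6931
```

The diagonal example makes the probability interpretation explicit: the p_k on the diagonal are the occupation probabilities, and S measures how much is missing to know which |k&gt; a particular realization occupies.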

For a pure state psi, rho has rank 1, and the sum extends only over the single index k with |k&gt; = psi. Thus in this case, p_k = 1 and S = -kbar * 1 log 1 = 0, as it should be for a state of maximal information. The amount of missing information is zero.
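This rank-1 case can be checked directly (a sketch using numpy, with kbar = 1 and an arbitrary illustrative state vector psi):

```python
import numpy as np

# A pure state: rho = |psi><psi| has rank 1, with single nonzero eigenvalue p = 1.
psi = np.array([1.0, 1.0]) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())

p = np.linalg.eigvalsh(rho_pure)
p = p[p > 1e-12]                  # only the eigenvalue p = 1 survives
S = -np.sum(p * np.log(p))        # -1 * log 1 = 0: no missing information
print(S)
```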

For more along these lines, and in particular for a way to avoid the probabilistic issues indicated above, see Sections 6 and 12 and Appendix A of my paper

A. Neumaier, On the foundations of thermodynamics, arXiv:0705.3790 http://lanl.arxiv.org/abs/0705.3790

But how does the infinite amount of information in a pure state (wave function) square with the finiteness of entropy?

Specifying a mixed state _exactly_ provides already an infinite amount of information, since the density matrix rho must be specified to infinite precision. Defining the eigenstates that are of interest in measurement amounts to specifying a Hamiltonian operator H _exactly_, which again provides already an infinite amount of information, since the coefficients of H in an explicit description must be specified to infinite precision.

Then only a finite amount of information is missing to determine in which of the eigenstates a particular particle is.

Of course, in practice one just _postulates_ rho and H based on a finite number of measurements, and _pretends_ (i.e., proceeds as if) they are known exactly, while knowing well that one knows them only approximately.

In practice, a number of approximations are made. Frequently, one postulates exact equilibrium, hence a grand canonical ensemble, which of course is not exactly valid. Deviations from equilibrium are handled by means of a hydrodynamical approximation, in which entropy is no longer a number but a field - and specifying the entropy density again requires an infinite amount of information. Of course, one also represents this only to some limited accuracy, to keep things tractable.

Thus finiteness of the entropy in a particular model is enforced by making simplifying assumptions which are valid only if one doesn't look too closely.

Indeed, as the Gibbs paradox (discussed, e.g., as Example 9.1 in my above thermodynamics paper) shows, the amount of entropy depends on the level of modeling.

An analogy contributed by Gerard Westendorp: To describe a classical, slightly biased die exactly by a probability distribution also requires an infinite amount of information, namely the specification of 5 infinite decimal expansions of the probabilities p_k for getting k eyes. (The sixth is then determined, since the probabilities sum to 1.) This is much more than the finite amount of information in saying which particular value k was obtained in a specific throw of the die. On the other hand, _given_ the distribution, the entropy S = - sum p_k log p_k is finite.
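The die analogy can be sketched numerically (the bias values below are made up for illustration; only 5 probabilities are specified, the sixth follows from normalization):

```python
import numpy as np

# Five probabilities of a slightly biased die, each in principle an infinite decimal;
# the sixth is determined by the normalization sum p_k = 1.
p = np.array([0.17, 0.16, 0.17, 0.16, 0.17])
p = np.append(p, 1.0 - p.sum())

# Given the distribution, the entropy is a single finite number,
# bounded above by log 6 ≈ 1.7918 (attained only by a fair die).
S = -np.sum(p * np.log(p))
print(S)
```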

In general, describing the probabilistic state of an ensemble exactly requires much more information than the exact description of a particular realization.