-------------------------------------
What is the meaning of probabilities?
-------------------------------------
To say that
"The probability that someone in risk group A will die of cancer is 1/3"
does _not_ mean that
"10 out of 30 people in risk group A will die of cancer".
It only means that,
"on the average, 10 out of 30 randomly chosen people in risk group A
will die of cancer".
(Indeed, a probability is nothing but an averaged relative frequency,
the average being taken in the sense of the mathematical expectation,
i.e., the mean of the complete ensemble of interest.)
Given a reliable stochastic model, such a statement can be checked
(in the limit) by many repeated simulations, or (directly) by a
theoretical computation; both require that the complete ensemble is
available. On the other hand, the reliability of the model can be
checked only by monitoring a large group of people and by counting how
many will die of cancer.
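The check by repeated simulation can be sketched in Python. The toy model below is an assumption for illustration only: a finite ensemble in which exactly 1/3 of the members carry the attribute of interest. No single random sample of 30 need contain exactly 10 such members, but the count averaged over many samples approaches 10.

```python
import random

# Hypothetical finite ensemble, assumed for illustration: exactly 1/3
# of its members carry the attribute of interest (encoded as 1).
ensemble = [1] * 1000 + [0] * 2000

# The exact ensemble probability is a plain count over all members.
exact = sum(ensemble) / len(ensemble)   # = 1/3 by construction

# A single random sample of 30 need not contain exactly 10 ones ...
sample = random.sample(ensemble, 30)

# ... but the count averaged over many repeated samples approaches 10.
trials = 5_000
mean = sum(sum(random.sample(ensemble, 30)) for _ in range(trials)) / trials
```

Note that computing `exact` requires access to the complete ensemble, while `mean` only converges to 10 in the limit of many samples.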
Of course, in using probabilities for predictive purposes, an insurance
company tacitly assumes (without any guarantee) that the group of 30
people of interest is actually well approximated by a random sample,
so that one can expect 10 out of the 30 to die of cancer. But this
tacit assumption may well turn out to be wrong.
Statements about ensembles are in principle exactly checkable:
Operationally, to say that "The probability that someone in
risk group A will die of cancer is 1/3" means nothing more or less
than that exactly 1/3 of _all_ people in risk group A will die of
cancer.
(This assumes that risk group A is finite. For infinite ensembles,
to define the precise meaning of '1/3 of all', one needs to go into
technicalities leading to measure theory. Indeed, measures are the
mathematically rigorous versions of 'classical ensembles' in general.
For quantum ensembles, see quant-ph/0303047.)
Of course, we cannot check this before we have information about
how _all_ people in risk group A died, but once we have this
information, we can check and verify or falsify the statement.
In terms of precise mathematics: A classical ensemble is the set of
elementary events underlying the sigma algebra over which the measure
is defined. For example, in any finite sigma algebra containing random
variables representing a fair coin (realizations 0,1; 1=head,
each with probability 50%), one has a finite ensemble of elementary events,
and exactly half of them come out heads. For an infinite sigma algebra,
the ensemble is infinite; but with the natural weighting, again exactly
half of them come out heads.
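A minimal sketch of this, assuming a small number of coins (N=4, chosen arbitrarily) so that the complete ensemble of 2^N elementary events can be enumerated; exact rational arithmetic shows that the weighted average fraction of heads is exactly 1/2, with no sampling and no approximation involved:

```python
from fractions import Fraction
from itertools import product

N = 4  # an arbitrary small number of fair coins, for illustration

# The complete ensemble: all 2^N elementary events (tuples of 0s and
# 1s, 1 = head), each carrying the natural weight 1/2^N.
ensemble = list(product((0, 1), repeat=N))

# The ensemble average of the fraction of heads per event is exactly
# 1/2: total heads over all events, divided by N times the ensemble size.
avg_heads = Fraction(sum(sum(e) for e in ensemble), N * len(ensemble))
```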
Usually, however, we only have incomplete knowledge about the ensemble.
For example, 'Tossing 10 fair coins' is just a sloppy way of saying
'Selecting a sample of size 10 from the total ensemble'.
The sigma algebra for modeling this must contain at least 10 independent
random variables representing fair coins. This is the case, e.g., in the
direct product of N>=10 sigma algebras isomorphic to 2^{0,1}
(the power set of {0,1}). Clearly, the number of heads in such a
sample is 5 (=50%) only on average over many random samples; and it
is impossible to infer the exact probability from a single sample.
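This can be illustrated by a short simulation (seed and trial counts are arbitrary choices for this sketch): individual samples of 10 tosses scatter around 5 heads, and only the average over many samples settles near 50%.

```python
import random

random.seed(0)  # arbitrary seed, for reproducibility of this sketch only

def heads_in_sample(n=10):
    """Number of heads in one random sample of n fair-coin tosses."""
    return sum(random.randint(0, 1) for _ in range(n))

# Individual samples of 10 tosses scatter anywhere between 0 and 10 ...
singles = [heads_in_sample() for _ in range(5)]

# ... and only the average over many samples settles near 5 (= 50%).
trials = 20_000
avg = sum(heads_in_sample() for _ in range(trials)) / trials
```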
This is why statisticians say that they _estimate_ probabilities
based on _incomplete_ knowledge, collected from a sample.
The resulting estimated probabilities are known to be inherently
inaccurate; but they can be checked approximately against independent
data (cross-validation), yielding confidence levels that indicate how
much the predictions can be trusted.
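A sketch of such an estimate, assuming for illustration a source with a head probability that is secretly 1/2 (so the estimate can be judged), together with the standard normal-approximation confidence interval:

```python
import math
import random

random.seed(1)  # arbitrary seed, for reproducibility of this sketch

# A sample from a source whose head probability is unknown to the
# statistician (here secretly 1/2, so the estimate can be judged).
n = 1000
sample = [random.randint(0, 1) for _ in range(n)]

# Point estimate: the relative frequency observed in the sample.
p_hat = sum(sample) / n

# Approximate 95% confidence interval (normal approximation); its width
# indicates how much the estimated probability can be trusted.
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - half_width, p_hat + half_width)
```

The estimate is inherently inaccurate: a different sample would give a different `p_hat`, and only the interval quantifies this uncertainty.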
Probabilists, on the other hand, _compute_ probabilities from
_assumed_ complete knowledge about the ensemble, namely the
theoretical probability distribution. Thus if complete information
goes in, exact information comes out, while computations based on
incomplete information naturally only give approximate results,
inheriting some uncertainty from the input.
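Such a computation can be sketched as follows, assuming as the complete knowledge the binomial distribution B(10, 1/2) of 10 independent fair coins; the result is then exact, not an estimate:

```python
from fractions import Fraction
from math import comb

# Assumed complete knowledge: 10 independent fair coins, i.e. the
# binomial distribution B(10, 1/2). Probabilities follow by computation.
def prob_heads(k, n=10):
    """Exact probability of k heads in n fair tosses."""
    return Fraction(comb(n, k), 2 ** n)

p5 = prob_heads(5)  # exact probability of exactly 5 heads: 63/256
```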
Computed probabilities are powerful, but only if the assumed stochastic
model is correct. Empirical estimates are usually inaccurate but useful.
The two approaches are not contradictory; indeed, in practice they
are combined without any difficulty.
The only subjective aspect in the whole thing is the choice of a
stochastic model when making theoretical predictions; and even this
is made almost objective by the standard rules of statistical
inference and model building.
Indeed, the choice of ensemble is _always_ a subjective act that
determines what the probabilities mean. It encodes what the user is
prepared to assume about the given situation. Once the ensemble is
chosen - either as a theoretical, exactly known ensemble, defined by
specifying a distribution, or as a real-life ensemble of which only
a (perhaps growing) sample is available - all probabilities have an
objective meaning.
A chosen ensemble is knowledge precisely if it is close to the correct
ensemble, and we have a good idea of how close it is.
That's why we highly value scientists such as Gibbs, who guessed
the right ensembles for statistical mechanics, which turned out to
give a highly accurate description of equilibrium situations.
Only good choices are knowledge.
And what is good is found out only through proper checking,
and not through the principle of insufficient reason.
In the case of tossing a coin, we know that the fairness assumption
is usually reasonable, being consistent with experience.
In the case of taking an exam with a newly appointed professor about
whom no one knows anything, reasoning from the two possible outcomes
(pass or fail) via the principle of insufficient reason to assign
a failure probability of 50% is ridiculous, and dangerous for those
who are not prepared.