The "Old Evidence" Problem

Answers from Bayesians

My own reply

Bibliography

Recently, so-called "Bayesianism" has become popular as a formalization of scientific reasoning. However, there have been many objections to Bayesianism, and Clark Glymour added a new one, namely the "old evidence" problem (1980). The purpose of this paper is to examine the arguments of Glymour and his opponents and to suggest a new solution to the problem.

Bayesianism claims that Bayes's theorem gives a formal structure to inductive logic. Bayes's theorem (BT) governs the relationship between a theory and its evidence:

(BT) P(T|E)=P(T)P(E|T)/P(E)

P(T) is called the *prior probability*, the probability of the theory before we get the evidence;
P(E) is the *expectedness* of the evidence, the degree to which the evidence is expected to occur; P(E|T) is the
*likelihood* of E, given that the theory T is true; P(T|E) is called the *posterior probability* of T, the
probability of T given that E has happened. If the theory predicts that something unexpected (P(E) is
low) will occur (P(E|T) is high), and if this really happens, the theory will increase its authenticity
(so P(T|E)>P(T)); and the more surprising the prediction is (the lower P(E) is), the stronger the
support it gives to the theory (the closer P(T|E) gets to 1). These results seem to agree with our
intuition about the relation between a theory and its evidence. The theorem can easily be proved
from the axioms of probability and the definition of conditional probability.
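
As a minimal sketch of how (BT) behaves, the following Python function computes the posterior from the three quantities defined above. The function name `posterior` and all numerical values are illustrative assumptions, not figures taken from any historical case.

```python
def posterior(prior_t, likelihood_e_given_t, expectedness_e):
    """Return P(T|E) according to (BT): P(T|E) = P(T) * P(E|T) / P(E)."""
    if not 0.0 < expectedness_e <= 1.0:
        raise ValueError("P(E) must lie in (0, 1]")
    joint = prior_t * likelihood_e_given_t  # this is P(T and E)
    # Coherence: P(T and E) can never exceed P(E).
    if joint > expectedness_e + 1e-12:
        raise ValueError("P(T)P(E|T) cannot exceed P(E)")
    return joint / expectedness_e


# Hypothetical values: the same prior and likelihood, but different expectedness.
prior = 0.1
surprising = posterior(prior, 0.9, 0.2)    # E was unexpected
unsurprising = posterior(prior, 0.9, 0.9)  # E was expected anyway

print(f"surprising evidence:   P(T|E) = {surprising:.3f}")
print(f"unsurprising evidence: P(T|E) = {unsurprising:.3f}")
```

With these assumed values the unexpected evidence raises the probability of T considerably, while the expected evidence leaves it where it was.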

Another important claim of Bayesianism is that these probabilities should be regarded as
subjective, though rational. With this claim Bayesianism can avoid the problem of
assigning a prior probability to a theory. We can assign this value according to our intuitive,
personal degree of belief (though, since it is a probability, it should obey the axioms of probability;
for example, it should lie between 0 and 1. Usually it also should not be exactly 1 or 0: if P(T) = 0,
then automatically P(T|E) = 0 and no evidence can support the theory; if P(T) = 1, every
alternative theory must have probability 0). Such arbitrariness may seem to go against our
intuition about the objectivity of scientific theories. However, even if we start from a variety of
prior probabilities, we can arrive at very close posterior probabilities by accumulating evidence (this, too, can be proven from Bayes's theorem). This result also agrees with our intuition about scientific
practice.
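
The convergence claim can be illustrated with a small Python sketch. It rests on assumptions that the text does not state: two hypothetical agents who disagree only about the prior, who agree on the likelihoods P(E|T) and P(E|notT), and who both observe a run of outcomes that T makes more probable than notT does. All names and numbers are illustrative.

```python
def update(prior_t, p_e_given_t, p_e_given_not_t):
    """One application of (BT), with P(E) obtained from total probability."""
    p_e = prior_t * p_e_given_t + (1.0 - prior_t) * p_e_given_not_t
    return prior_t * p_e_given_t / p_e


def accumulate(prior_t, n_observations, p_e_given_t=0.8, p_e_given_not_t=0.4):
    """Update repeatedly on n independent observations of the same kind."""
    belief = prior_t
    for _ in range(n_observations):
        belief = update(belief, p_e_given_t, p_e_given_not_t)
    return belief


# Two hypothetical agents with very different priors for T.
skeptic = accumulate(0.1, 20)
enthusiast = accumulate(0.9, 20)

gap_before = abs(0.9 - 0.1)
gap_after = abs(enthusiast - skeptic)
print(f"gap between the agents before: {gap_before:.3f}")
print(f"gap between the agents after:  {gap_after:.6f}")
```

Under these assumptions the initial disagreement almost disappears after twenty observations. If the agents also disagreed about the likelihoods, this simple sketch would not guarantee convergence.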

In spite of these advantages, Bayesianism also has several difficulties. One of the major
difficulties was put forward by Clark Glymour (1980, 85-93) and is called the "old evidence"
problem. In the history of science, we can find many examples of a theory seemingly being
confirmed by evidence that was already known before the theory was proposed. For example, the
anomalous advance of the perihelion of Mercury is usually considered strong support for
the general relativity theory. When Einstein finally established his general relativity theory in
November 1915, this phenomenon was a well-established fact (Glymour 1980, 88). Several attempts had been made to explain this phenomenon by Newtonian physics or the special relativity
theory, but they were not successful (Earman 1992, 132). In fact, one of Einstein's motivations
in constructing the general relativity theory was to explain this anomaly (ibid., 123). In November
1915 he was not sure whether the final version of his field equations could completely explain the
phenomenon (ibid., 115), and this relation was established later. The general relativity theory has other
factual support, such as the bending of light by gravity (and this really was a novel
fact that the general relativity theory predicted), but the perihelion phenomenon is considered
the strongest support for the theory (Brush 1989). There are similar relationships between
Newton's law of gravity and Kepler's laws, and between the special relativity theory and the
Michelson-Morley experiment. In fact, such pieces of old evidence are regarded as crucial for the
acceptance of these theories.

Surprisingly enough, this commonsense relation is hard to explain if we take (BT) literally.
First, we should assign probability 1 to a piece of old evidence E (P(E)=1), because we know that it
has already happened; and the probability of E under any theory T should also be 1 (P(E|T)=1). (This last
requirement needs a little explanation. As a theorem of probability, there is the relation
P(E)=P(T)P(E|T)+P(notT)P(E|notT)
Here no probability can be greater than 1, and P(T)+P(notT)=1. If P(E)=1, then to meet all
these conditions both P(E|T) and P(E|notT) must be 1, provided that neither P(T) nor P(notT) is 0.) If we put these values into (BT), the
posterior probability of T and the prior probability of T become the same (P(T|E)=P(T)). This
means that the old evidence has nothing to do with the authenticity of the theory and therefore
cannot confirm it. Thus, against the judgment of most scientists, the perihelion phenomenon
cannot support the general relativity theory. This seems ridiculous. The result is surprising
because Bayesianism is supposed to explain the relationship between theory and
evidence very well.
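
The argument above can be checked numerically. The following Python sketch solves the total probability relation for the smallest value of P(E|T) that is compatible with given P(E) and P(T), using only the fact that P(E|notT) cannot exceed 1. The helper names and the prior of 0.3 are hypothetical choices made for illustration.

```python
def min_likelihood_given_t(p_e, prior_t):
    """Smallest P(E|T) allowed by P(E) = P(T)P(E|T) + P(notT)P(E|notT).

    Setting P(E|notT) to its maximum, 1, gives the lower bound
    P(E|T) >= (P(E) - P(notT)) / P(T).
    """
    if not 0.0 < prior_t < 1.0:
        raise ValueError("P(T) must lie strictly between 0 and 1")
    bound = (p_e - (1.0 - prior_t)) / prior_t
    # Clamp to [0, 1] to absorb floating-point rounding.
    return min(1.0, max(0.0, bound))


def posterior(prior_t, p_e_given_t, p_e):
    """(BT): P(T|E) = P(T) * P(E|T) / P(E)."""
    return prior_t * p_e_given_t / p_e


prior = 0.3  # hypothetical prior for T
forced_likelihood = min_likelihood_given_t(1.0, prior)
updated = posterior(prior, forced_likelihood, 1.0)

print(f"P(E|T) forced by P(E)=1: {forced_likelihood}")
print(f"P(T) = {prior}, P(T|E) = {updated}")
```

Whatever prior between 0 and 1 is chosen, P(E)=1 forces P(E|T)=1, and the posterior equals the prior.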

There are several answers from Bayesians; the most typical are Garber's (1983) and
Howson's (1985, 1991).

Garber's answer admits that old evidence will not confirm a new theory. But when we formulate a new theory, we also gain new knowledge, namely the knowledge that the theory (together with
other assumptions) implies the evidence. This relation of implication (or explanation) is
derivable from the knowledge we already have, so if we were logically omniscient, it would not
be new to us. Garber says that the logical omniscience assumption is false and unnecessary. If
we admit that we learn logical facts of this kind, the support from old evidence can be explained by such
learning. There is an observation that supports this answer. If we deliberately construct the new
theory to explain the evidence, i.e., when we already know the logical relation between the
theory and the evidence, it is natural that the evidence does not improve the plausibility of the theory.

Howson, on the other hand, suggests a revision of Bayesianism. He says that when we evaluate
whether a piece of old evidence E confirms a new theory T, we should reassign the prior probability of T
and the expectedness of E as if we did not know the evidence. We should imagine, for example,
what probabilities we would assign to the result of the Michelson-Morley experiment and to the special relativity theory if we did not know that result. If we put these values into (BT), we
will find that this experiment strongly confirms the special relativity theory.

Both answers face several problems. One objection to Garber concerns consistency with
the axioms of probability (Chihara 1987). Bayesianism requires that our subjective probabilities
obey the axioms of probability. For example, one of the axioms states that if A is a logical truth, then
P(A)=1. If we are not logically omniscient, we cannot employ this axiom. Another objection
maintains that there are cases in which we already know the logical relation and yet the
evidence supports the theory. Even if Einstein had known that the general relativity theory entails the
perihelion phenomenon, this phenomenon would nevertheless have increased the authenticity of
the general relativity theory. The situation is clearer when we take other people into account. As Earman
points out (Earman 1989, 333), for most of us the relation between the perihelion phenomenon
and the general relativity theory is the first thing we learn about that theory. Nevertheless, it
plays an important role in our acceptance of the theory.

Howson's answer also encounters several problems. One was already pointed out by Glymour himself
(Glymour 1980; Chihara 1987). There is something wrong with the idea of a "counterfactual
degree of belief". How can we find such a degree? Simply subtracting E from our knowledge
is not enough, for we can derive E from other related knowledge. Imagine that we do not know the
result of the Michelson-Morley experiment, but we do know how scientists reacted after the
experiment; we can easily guess what happened in the experiment. We should subtract this
related knowledge as well, but to what extent? It is not easy to answer this question, and Howson
does not offer a convincing answer. In addition to this formal problem, there is a further problem
with the applicability of Howson's answer to historical cases (Garber 1983, 103). When we
explain a historical case such as Einstein's, we are dealing with the actual degrees of belief of
Einstein and other scientists. With these actual degrees (including, of course, the knowledge of
the old evidence), how did they reach their decisions on the authenticity of the general
relativity theory?

Before starting my reply, I should modify the question. Actually, we have no reason to
assume that the probability of old evidence is 1. It is possible that our memories are
incorrect, or that we made a systematic mistake in measurement, or even that there was a
collective hallucination. Taking these possibilities into account, the probability of old evidence E
becomes, say, P(E)=0.995. With this modification, the "old evidence" problem technically
vanishes (Earman 1992, 121). However, as Earman argues, this is not very helpful in itself (ibid.). As long as P(E) is very
high, E cannot give strong support to T (one can check this by putting the value into
(BT) and calculating; even if one assumes P(E|T) = 1, the prior and posterior probabilities remain
almost the same). Now the question is this: with a very high probability for E, how can we get
strong support for T? This small modification is nevertheless a necessary part of my argument.
The reason will become clear later.
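
The check suggested above can be sketched in a few lines of Python. Since P(E|T) is at most 1, (BT) implies that the ratio P(T|E)/P(T) is at most 1/P(E). The function name `max_boost` and the prior of 0.2 are illustrative assumptions.

```python
def max_boost(p_e):
    """Upper bound on P(T|E) / P(T), reached when P(E|T) = 1."""
    if not 0.0 < p_e <= 1.0:
        raise ValueError("P(E) must lie in (0, 1]")
    return 1.0 / p_e


prior = 0.2  # hypothetical prior for T
for p_e in (0.5, 0.9, 0.995):
    print(f"P(E) = {p_e}: P(T|E) is at most {prior * max_boost(p_e):.4f}")

posterior_at_0995 = prior * max_boost(0.995)
```

With P(E)=0.995 the posterior can exceed the prior by only about half a percent of its value, however well T accounts for E.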

My own answer is a revision of Garber's answer that introduces another factor. First of all, the
old evidence situation is not a single situation. We can distinguish at least four different
situations, as set forth below. (Hereafter I discuss the situation in which theory T entails E, i.e.,
P(E|T)=1. This is for simplicity; the results can easily be extended to more general cases, i.e.,
P(E|T)<1.)

(1) we already knew that T entails E before the theory formation, and E seems to confirm T.

(2) we already knew that T entails E before the theory formation, and E does not seem to confirm T.

(3) we learned that T entails E after the theory formation, and E seems to confirm T.

(4) we learned that T entails E after the theory formation, and E does not seem to confirm T.

First, consider the difference between situations (1) and (2). In the case of the
general relativity theory, no other theory was available to explain the perihelion
phenomenon. Thus, even if Einstein had known that his theory entails that phenomenon, the
theory would have had an advantage relative to other theories. This is the situation envisaged in (1). On the other hand,
the general relativity theory explained not only the perihelion phenomenon but also classic
phenomena, such as the fact that planetary orbits are almost
elliptical. These phenomena do not seem to
confirm the general relativity theory, because Newtonian physics can also explain them very
well. This is the situation in (2). Garber does not recognize situation (1). What is the difference between (1) and (2)? My
answer is that the difference depends on whether other available theories can also explain the evidence.

The same difference exists between situations (3) and (4). We have already seen many examples of (3).
To see when (4) applies, suppose that there was a phenomenon which was well known but had
not been explained when the general relativity theory was established; and suppose that it was
later found that it can be explained by Newtonian physics, and thus also by the general
relativity theory. This phenomenon will not confirm the general relativity theory at all, for the same
reason as in situation (2). This is the situation in (4).

Is there any difference between the confirmations in (1) and those in (3)? Yes. Generally speaking, when we do not expect the entailment, its discovery
increases the authenticity of the theory remarkably. So the confirmatory power is stronger in (3) than in (1).

Can we deal with all these results in terms of Bayesianism? I think we can.

First, the difference between (1) and (3) is already explained by Garber. The difference
does not come from the evidence E itself, but rather from other evidence about the relation
between T and E. Among the objections I considered above, the last one (there are cases
where E seems to confirm T even though we already know the relation) can be answered
if I can give (1) independent support. I present that argument later. What about the logical
omniscience problem? Eells gives an answer to it (Eells 1990, 215-216). When we use subjective
probabilities, it is not our logical competence that we should change, but rather the
interpretation of the axioms of probability. Take the axiom "if A is a logical truth, then
P(A)=1". This should be interpreted as follows: "if I know that A is a logical truth, then I should assign
P(A)=1". Here there is no need for logical omniscience. I think this answer is reasonable.
Secondly, when we compare (1) and (3) with (2) and (4), the "old evidence"
problem actually disappears. To explain this, I would like to introduce a new concept, "relative confirmation". Usually
in Bayesianism we think that the evidence confirms the theory iff P(T|E)>P(T). This is not
necessarily the only path to confirmation; at least it is not a part of the "hard core" of
Bayesianism. My "relative confirmation" is defined as follows:

(RC) A piece of evidence E relatively confirms a theory T iff P(T|E) is not less than P(T) and, for every alternative theory Ti, P(Ti|E)<P(Ti).

Suppose we consider only three theories (T1, T2, T3), and suppose that T1 is relatively confirmed as
compared to T2 and T3. This means that the probability of T1 is at least what it was before, while the probabilities of T2 and T3 have decreased because of E. In this case, even if the degree of belief in T1 does not increase, we
will sense that it has increased. This can be put in another way: the degree of
belief in T1 has increased in proportion to the degrees of belief in the other theories. Bayesians can admit that this is a kind of
support, for we now have a stronger reason than before to choose T1 instead of T2 or T3.

Old evidence can give this kind of relative support to a theory. Suppose T1 is the general
relativity theory and T2 is Newtonian physics. T1 entails E (here, the perihelion phenomenon),
so P(E|T1)=1. With T2 it is hard to explain E, but since there may be some overlooked factors
which help T2, let us estimate P(E|T2)=0.25. P(E) is almost 1; say, P(E)=0.995 (by the way,
if P(E)=1, it is impossible that P(E|T2)=0.25, as I showed above; this is why I modified the
question at the beginning of this section). With these values, we can calculate from (BT) that P(T1|E) = P(T1)/0.995, which is
approximately 1.005 P(T1), and P(T2|E) = 0.25 P(T2)/0.995, which is approximately 0.251 P(T2). This means that the
probability of T2 drops sharply, while the probability of T1 remains almost the same as before. The
same thing happens between the general relativity theory and the special relativity theory. Only
these three theories were seriously considered at the time. This remarkable increase in the relative
authenticity of the general relativity theory is the reason scientists accepted it. Such support can
be very strong, as we saw above.
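
The calculation above, together with the (RC) test, can be sketched in Python. By (BT), the ratio P(T|E)/P(T) equals P(E|T)/P(E) for any theory with a non-zero prior, so the priors themselves are not needed. The function names are hypothetical, and the last quantity is a coherence check that is not part of the original argument.

```python
def confirmation_factor(p_e_given_t, p_e):
    """P(T|E) / P(T), which by (BT) equals P(E|T) / P(E)."""
    return p_e_given_t / p_e


def relatively_confirms(target_factor, rival_factors):
    """(RC): the target does not lose probability and every rival does."""
    return target_factor >= 1.0 and all(f < 1.0 for f in rival_factors)


P_E = 0.995          # expectedness of the perihelion phenomenon (from the text)
P_E_GIVEN_T1 = 1.0   # general relativity entails E (from the text)
P_E_GIVEN_T2 = 0.25  # the estimate for Newtonian physics (from the text)

factor_t1 = confirmation_factor(P_E_GIVEN_T1, P_E)
factor_t2 = confirmation_factor(P_E_GIVEN_T2, P_E)
rc_holds = relatively_confirms(factor_t1, [factor_t2])

# Factor by which the odds of T1 against T2 change after conditioning on E.
odds_shift = factor_t1 / factor_t2

# Coherence check (an added assumption): total probability gives
# P(E) <= 1 - P(T2) * (1 - P(E|T2)), which caps the prior of T2.
max_prior_t2 = (1.0 - P_E) / (1.0 - P_E_GIVEN_T2)

print(f"P(T1|E) = {factor_t1:.3f} P(T1)")
print(f"P(T2|E) = {factor_t2:.3f} P(T2)")
print(f"(RC) satisfied: {rc_holds}; odds of T1 against T2 multiplied by {odds_shift:.1f}")
print(f"largest coherent P(T2): {max_prior_t2:.4f}")
```

On these numbers E relatively confirms T1, and the odds of T1 against T2 become four times larger. The last line shows that the chosen values are jointly coherent only if the prior of T2 is already small, at most about 0.0067.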

To summarize, my answer proposes to remedy the weak point of Garber's answer by introducing the
notion of "relative confirmation". I think this answer is still only an approximation of the correct
account, but hopefully this supplement points in the right direction.

Brush, S. G. (1989) "Prediction and theory evaluation: the case of light bending", *Science 246*,
1124-1129.

Chihara, C. S. (1987) "Some problems for Bayesian confirmation theory", *The British Journal
for the Philosophy of Science 38*, 551-560.

Earman, J. (1989) "Old evidence, new theories: two unresolved problems in Bayesian
confirmation theory", *Pacific Philosophical Quarterly 70*, 323-340.

--(1992) *Bayes or Bust? : a critical examination of Bayesian confirmation theory*, The MIT
Press.

Eells, E. (1990) "Bayesian problems of old evidence", in C. Wade Savage (ed.) *Scientific
Theories, Minnesota Studies in the Philosophy of Science, Vol. XIV*, Minneapolis, University of
Minnesota Press, 205-223.

Garber, D. (1983) "Old evidence and logical omniscience in Bayesian confirmation theory", in
J. Earman (ed.) *Testing Scientific Theories, Minnesota Studies in the Philosophy of Science,
Vol. X*, Minneapolis, University of Minnesota Press.

Glymour, C. (1980) *Theory and Evidence*, Princeton, Princeton UP.

Howson, C. (1985) "Some recent objections to the Bayesian theory of support", *The British
Journal for the Philosophy of Science 36*, 305-309.

--(1990) "Fitting your theory to the facts: probably not such a bad thing after all", in C. Wade
Savage (ed.) *Scientific Theories, Minnesota Studies in the Philosophy of Science, Vol. XIV*,
Minneapolis, University of Minnesota Press.

--(1991) "The 'old evidence' problem", *The British Journal for the Philosophy of Science 42*,
547-555.

Jeffrey, R. C. (1983) "Bayesianism with a human face", in J. Earman (ed.) *Testing Scientific
Theories, Minnesota Studies in the Philosophy of Science, Vol. X*, Minneapolis, University of
Minnesota Press.

Mayo, D. G. (1991) "Novel evidence and severe tests", *Philosophy of Science 58*, 523-552.

Niiniluoto, I. (1983) "Novel facts and Bayesianism", *The British Journal for the Philosophy of
Science 34*, 375-379.

If you have any comments, questions or anything else, please mail to tetsuji@wam.umd.edu .

<http://www.wam.umd.edu/~tetsuji/works/bayes.html> Last modified: July 13, 1996