A rare genetic disease is discovered. Although only one in a million people
carry it, you consider getting screened. You are told that the genetic test is extremely good;
it is 100% sensitive (it is always correct if you have the disease) and 99.99% specific
(it gives a false positive result only 0.01% of the time).
Having recently learned Bayes' theorem, you decide not to take the test. Why? (From Durbin et al., "Biological Sequence Analysis", Cambridge University Press, 1998)
Bayes' Theorem states that for events X and Y: P(X|Y)=P(Y|X)*P(X)/P(Y).
We want to know the probability of being healthy (X) given a positive test result (Y), abbreviated PT. According to Bayes' Theorem, P(healthy|PT)=P(PT|healthy)*P(healthy)/P(PT). From the problem we know that P(healthy)=1-0.000001=0.999999 and that the false positive rate is 0.01%, so P(PT|healthy)=0.0001. The only unknown in the formula above is the probability of a positive test, P(PT). It can be calculated using the definition of marginal probability, P(Y)=P(Y|Z1)*P(Z1)+...+P(Y|Zn)*P(Zn), where Zi, i=1...n, are mutually exclusive events that together cover all possibilities. In our case there are only two such events: "being healthy" and "being sick". Therefore P(PT)=P(PT|healthy)*P(healthy)+P(PT|sick)*P(sick). From the problem we know that P(PT|sick)=1.0 (the test is always correct in the presence of the disease) and P(sick)=0.000001.
Substituting the numbers into the formula we get
P(PT)=0.0001*0.999999+1.0*0.000001=0.0001009999, or approximately 0.000101.
Finally, P(healthy|PT)=0.0001*0.999999/0.000101≈0.9901, which is very close to 1. So the probability of still being healthy given that the test came back positive is above 99%. In other words, a positive result would almost certainly be a false alarm, which is a good reason for not taking the test.
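The arithmetic above can be checked with a short Python sketch. The function name and its parameter names (prevalence, sensitivity, specificity) are chosen here for illustration and do not come from the original problem; exact fractions are used so that no rounding enters the intermediate steps.

```python
from fractions import Fraction


def posterior_healthy_given_positive(prevalence, sensitivity, specificity):
    """Return P(healthy | positive test) using Bayes' theorem.

    prevalence  = P(sick)
    sensitivity = P(PT | sick)
    specificity = 1 - P(PT | healthy)
    """
    p_sick = prevalence
    p_healthy = 1 - prevalence
    p_pt_given_sick = sensitivity
    p_pt_given_healthy = 1 - specificity  # false positive rate

    # Marginal probability of a positive test over the two possible states.
    p_pt = p_pt_given_healthy * p_healthy + p_pt_given_sick * p_sick

    # Bayes' theorem: P(healthy | PT) = P(PT | healthy) * P(healthy) / P(PT)
    return p_pt_given_healthy * p_healthy / p_pt


# Numbers from the problem: 1 in a million prevalence,
# 100% sensitivity, 99.99% specificity.
result = posterior_healthy_given_positive(
    prevalence=Fraction(1, 1_000_000),
    sensitivity=Fraction(1),
    specificity=Fraction(9999, 10_000),
)

print(result)                    # exact value as a fraction
print(round(float(result), 4))   # decimal approximation
```

Changing the prevalence argument shows how strongly the answer depends on the rarity of the disease: with these test characteristics, the posterior probability of being healthy stays high as long as the disease is much rarer than the false positive rate.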