This is an exercise given in **Probability Theory: The Logic of Science** by Edwin Jaynes, 2003. There is a partial solution here. I have worked out a more general partial solution, and was wondering if anyone else has solved it. I will wait a bit before posting my answer, to give others a go.

Okay, so suppose we have $n$ mutually exclusive and exhaustive hypotheses, denoted by $H_i \;\;(i=1,\dots,n)$. Further suppose we have $m$ data sets, denoted by $D_j \;\;(j=1,\dots,m)$. The likelihood ratio for the $i$th hypothesis is given by:

$$LR(H_{i})=\frac{P(D_{1}D_{2}\dots D_{m}|H_{i})}{P(D_{1}D_{2}\dots D_{m}|\overline{H}_{i})}$$

Note that these are conditional probabilities. Now suppose that given the $i$th hypothesis $H_{i}$ the $m$ data sets are independent, so we have:

$$P(D_{1}D_{2}\dots D_{m}|H_{i})=\prod_{j=1}^{m}P(D_{j}|H_{i}) \;\;\;\; (i=1,\dots,n)\;\;\;\text{Condition 1}$$

Now it would be quite convenient if the denominator also factored in this situation, so that we have:

$$P(D_{1}D_{2}\dots D_{m}|\overline{H}_{i})=\prod_{j=1}^{m}P(D_{j}|\overline{H}_{i}) \;\;\;\; (i=1,\dots,n)\;\;\;\text{Condition 2}$$

For in this case the likelihood ratio splits into a product of smaller factors, one for each data set:

$$LR(H_i)=\prod_{j=1}^{m}\frac{P(D_{j}|H_{i})}{P(D_{j}|\overline{H}_{i})}$$

So in this case, each data set will "vote for $H_i$" or "vote against $H_i$" independently of any other data set.
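To see the factoring work in the case where it *can* hold, here is a small numerical sketch (with made-up biases and counts) for $n=2$. With only two hypotheses, $\overline{H}$ is itself a simple hypothesis, so Condition 2 follows from Condition 1 and the likelihood ratio factors across data sets:

```python
from math import comb

def binom_lik(k, n, p):
    """Binomial likelihood P(k heads in n flips | head probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical setup: H says the coin's bias is 0.8, H-bar says it is 0.3.
p_H, p_Hbar = 0.8, 0.3
# Two data sets: (heads, flips) for D1 and D2.
data = [(7, 10), (6, 10)]

# Flips are independent given each (simple) hypothesis, so the joint
# likelihood on each side factors across data sets.
joint_H, joint_Hbar, per_set_LR = 1.0, 1.0, []
for k, n in data:
    joint_H    *= binom_lik(k, n, p_H)
    joint_Hbar *= binom_lik(k, n, p_Hbar)
    per_set_LR.append(binom_lik(k, n, p_H) / binom_lik(k, n, p_Hbar))

LR_joint = joint_H / joint_Hbar          # LR computed from the joint likelihoods
LR_prod  = per_set_LR[0] * per_set_LR[1] # product of per-data-set ratios
print(LR_joint, LR_prod)                 # the two agree for n = 2
```

Each factor in `per_set_LR` is one data set's independent "vote" for or against $H$; their product recovers the overall likelihood ratio exactly.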

The exercise is to prove that if $n>2$ (more than two hypotheses), there is no non-trivial way in which this factoring can occur. That is, if you assume that Condition 1 and Condition 2 hold, then at most one of the factors

$$\frac{P(D_{1}|H_{i})}{P(D_{1}|\overline{H}_{i})}\,\frac{P(D_{2}|H_{i})}{P(D_{2}|\overline{H}_{i})}\dots\frac{P(D_{m}|H_{i})}{P(D_{m}|\overline{H}_{i})}$$

is different from 1, and thus only one data set will contribute to the likelihood ratio.

I personally found this result quite fascinating, because it basically shows that multiple hypothesis testing is nothing but a series of binary hypothesis tests.
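The obstruction for $n>2$ can be seen numerically. In this sketch (with made-up priors and likelihoods), $\overline{H}_1$ is a mixture of $H_2$ and $H_3$, and conditional independence given each $H_i$ does *not* carry over to the mixture: the joint $P(D_1 D_2|\overline{H}_1)$ differs from the product $P(D_1|\overline{H}_1)\,P(D_2|\overline{H}_1)$, so Condition 2 fails:

```python
# Hypothetical setup: three hypotheses with priors, two data sets,
# and per-hypothesis likelihoods chosen freely.
priors = [0.3, 0.3, 0.4]          # P(H1), P(H2), P(H3)
lik = [[0.9, 0.1, 0.5],           # P(D1 | H_i) for i = 1, 2, 3
       [0.8, 0.2, 0.5]]           # P(D2 | H_i) for i = 1, 2, 3

# Conditioning on H-bar_1 means renormalising the mixture of H2 and H3.
w = [priors[1], priors[2]]
w = [x / sum(w) for x in w]

# Joint likelihood: P(D1 D2 | Hbar_1) = sum_i w_i P(D1|H_i) P(D2|H_i)
# (using independence of D1, D2 given each simple hypothesis H_i).
joint = w[0] * lik[0][1] * lik[1][1] + w[1] * lik[0][2] * lik[1][2]

# Marginals: P(D_j | Hbar_1) = sum_i w_i P(D_j | H_i)
m1 = w[0] * lik[0][1] + w[1] * lik[0][2]
m2 = w[0] * lik[1][1] + w[1] * lik[1][2]

print(joint, m1 * m2)  # these differ: the denominator fails to factor
```

The gap between the two numbers is exactly the sense in which mixing over the remaining hypotheses induces dependence between the data sets, which is what Jaynes' result exploits.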


#### Best Answer

For the record, here is a somewhat more extensive proof. It also contains some background information. Maybe this is helpful for others studying the topic.

The main idea of the proof is to show that Jaynes' Conditions 1 and 2 imply that $$P(D_{m_k}|H_iX)=P(D_{m_k}|X),$$ for all but one data set $m_k=1,\ldots,m$. It then shows that for all these data sets, we also have $$P(D_{m_k}|\overline H_iX)=P(D_{m_k}|X).$$ Thus we have for all but one data set, $$\frac{P(D_{m_k}|H_iX)}{P(D_{m_k}|\overline H_iX)} = \frac{P(D_{m_k}|X)}{P(D_{m_k}|X)} = 1.$$ The reason that I wanted to include the proof here is that some of the steps involved are not at all obvious, and one needs to take care not to use anything other than Conditions 1 and 2 and the product rule (as many of the other proofs implicitly do). The link above includes all these steps in detail. It is on my Google Drive and I will make sure it stays accessible.
